How to design AI whistleblower legislation

If you follow the public discourse around AI governance at all (and, since you’re reading this, the odds of that are pretty good) you may have noticed that people tend to gravitate towards abstract debates about whether AI “regulation,” generally, is a good or a bad idea. The two camps were at each other’s throats in 2024 over California SB 1047, and before that bill was vetoed it wasn’t uncommon to see long arguments, ostensibly about the bill, that contained almost zero discussion of any of the actual things that the bill did.

That’s to be expected, of course. Reading statutes cover-to-cover can be a boring and confusing chore, especially if you’re not a lawyer, and it’s often reasonable to have a strong opinion on the big-picture question (“is frontier AI regulation good?”) without having similarly confident takes about the fine details of any specific proposal. But zooming in and evaluating specific proposals on their own merits has its advantages—not the least of which is that it sometimes reveals a surprising amount of consensus around certain individual policy ideas that seem obviously sensible. 

One such idea is strengthening whistleblower protections for employees at frontier AI companies. Even among typically anti-regulation industry figures, whistleblower legislation has proven less controversial than one might have expected. For example, SB 53, a recent state bill that would expand the scope of the protection offered to AI whistleblowers in California, has met with approval from some prominent opponents of its vetoed predecessor, SB 1047. The Working Group on frontier AI that Governor Newsom appointed after he vetoed SB 1047 also included a section on the importance of protecting whistleblowers in its draft report.

There also seems to be some level of potential bipartisan support for whistleblower protection legislation at the federal level. Federal AI legislation has been slow in coming; hundreds of bills have been proposed, but so far nothing significant has actually been enacted. Whistleblower laws, which are plausibly useful for mitigating a wide variety of risks, minimally burdensome to industry, and easy to implement and enforce, seem like a promising place to start. And while whistleblower laws have sometimes been viewed in the past as Democrat-coded pro-labor measures, rising conservative skepticism of big tech companies and the highly public controversy over the restrictive contracts that OpenAI pressured departing employees to sign in 2024 seem to have generated some interest in protecting AI whistleblowers on the other side of the aisle as well.

Okay, so now you’re sold on the value of AI whistleblower legislation. Naturally, the next step is to join the growing chorus of voices desperately crying out for a medium-dive LawAI blog post explaining the scope of the protections that AI whistleblowers currently enjoy, the gaps that need to be addressed by future legislation, and the key decision points that state and federal lawmakers designing whistleblower statutes will confront. Don’t worry, we’re all over it. 

1. What do whistleblower laws do? 

The basic idea behind whistleblower protection laws is that employers shouldn’t be allowed to retaliate against employees who disclose important information about corporate wrongdoing through the proper channels. The core example of the kind of behavior that whistleblower laws are meant to protect is that of an employee who notices that his employer is breaking the law and reports the crime to the authorities. In that situation, it’s generally accepted that allowing the employer to fire (or otherwise retaliate against) the employee for blowing the whistle would discourage people from coming forward in the future. In other words, the public’s interest in enforcing laws justifies a bit of interference with freedom of contract in order to prevent retaliation against whistleblowers. Typically, a whistleblower who has been retaliated against can sue the employer, or file an administrative complaint with a government agency, seeking redress for whatever harm they’ve suffered—often in the form of a monetary payment or reinstatement to the job from which they were fired.

Whistleblowing can take many forms that don’t perfectly conform to that core example of an employee reporting some law violation by their employer to the government. For instance, the person reporting the violation might be an independent contractor rather than an employee, or might report some bad or dangerous action that didn’t technically violate the law, or might report their information internally within the company or to a media outlet rather than to the government. Whether these disclosures are protected by law depends on a number of factors.

2. What protections do AI whistleblowers in the U.S. currently have?

Currently, whistleblowers in the U.S. are protected (or, as the case may be, unprotected) by a patchwork of overlapping state and federal statutes, judicially created doctrines, and internal company policies. By default, private sector whistleblowers[ref 1] are not protected from retaliation by any federal statute, although they may be covered by state whistleblower protections and/or judicially created anti-retaliation doctrines. However, there are a number of industry- and subject-matter-specific federal statutes that protect certain whistleblowers from retaliation. For example, the Federal Railroad Safety Act protects railroad employees from being retaliated against for reporting violations of federal law relating to railroad safety or gross misuse of railroad-related federal funds; the Food Safety Modernization Act affords comparable protections to employees of food packing, processing, manufacturing, and transporting companies; and the Occupational Safety and Health Act prohibits employers generally from retaliating against employees for filing OSHA complaints.

The scope of the protections afforded by these statutes varies, as do the remedies that each statute provides to employees who have been retaliated against. Some only cover employees who report violations of federal laws or regulations to the proper authorities; others cover a broader range of whistleblowing activity, such as reporting dangerous conditions even when they don’t arise from any violation of a law or rule. Most allow employees who have been retaliated against either to file a complaint with OSHA or to sue the offending employer for damages in federal court, and a few even provide substantial financial incentives for whistleblowers who provide valuable information to the government.[ref 2]

Employees who aren’t covered by any federal statute may still be protected by their state’s whistleblower laws. In the context of the AI industry, the most important state is California, where most of the companies that develop frontier models are headquartered. California’s whistleblower protection statute is quite strong—it protects both public and private employees from retaliation for reporting violations of any state, federal, or local law or regulation to a government agency or internally within their company. It also prohibits employers from adopting any internal policies to prevent employees from whistleblowing. The recently introduced SB 53 would, if enacted, additionally protect employees and contractors working at frontier AI companies from retaliation for reporting information about “critical risk” from AI models.

Even when there are no applicable state or federal statutes, whistleblowers may still be protected by the “common law,” i.e., law created by judicial decisions rather than by legislation. These common law protections vary widely by state, but typically at a minimum prohibit employers from firing employees for a reason that contravenes a clearly established “public policy.”[ref 3] What exactly constitutes a clearly established public policy in a given state depends heavily on the circumstances, but whistleblowing often qualifies when it provides a public benefit, such as increasing public safety or facilitating effective law enforcement. However, it’s often difficult for a whistleblower (even with the assistance of a lawyer) to predict ex ante whether common law protections will apply because so much depends on how a particular court might apply existing law to a particular set of facts. Statutory protections are generally preferable because they provide greater certainty and can cover a broader range of socially desirable whistleblowing behavior. 

3. Restrictions on whistleblowing: nondisclosure agreements and trade secrets

a. Nondisclosure and non-disparagement agreements

The existing protections discussed above are counterbalanced by two legal doctrines that can limit the applicability of anti-retaliation measures: the law of contracts and the law of trade secrets. Employers (especially in the tech industry) often require their employees to sign broad nondisclosure agreements that prohibit the employees from sharing certain confidential information outside of the company. It was this phenomenon—the use of NDAs to silence would-be whistleblowers—that first drew significant legislative and media attention to the issue of AI whistleblowing, when news broke that OpenAI had required departing employees to choose between signing contracts with broad nondisclosure and non-disparagement provisions or giving up their vested equity in the company. Essentially, the provisions would have required former employees to avoid criticizing OpenAI for the rest of their lives, even on the basis of publicly known facts, and even if they did not disclose any confidential information in doing so. In response to these provisions, a number of OpenAI employees and former employees wrote an open letter calling for a “right to warn about artificial intelligence” and had their lawyers write to the SEC arguing that OpenAI’s NDAs violated various securities laws and SEC regulations. 

After news of the NDAs’ existence went public, OpenAI quickly apologized for including the problematic provisions in its exit paperwork and promised to remove the provisions from future contracts. But the underlying legal reality that allowed OpenAI to pressure employees into signing away their right to blow the whistle hasn’t changed. Typically, U.S. law assigns a great deal of value to “freedom of contract,” which means that mentally competent adults are usually allowed to sign away any rights they choose to give up unless the contract in question would violate some important public policy. Courts sometimes hold that NDAs are unenforceable against legitimate whistleblowers because of public policy considerations, but the existence of an NDA can be a powerful deterrent to a potential whistleblower even when there’s some chance that a court would refuse to enforce the contract. 

By default, AI companies still have the power to prevent most kinds of whistleblowing in most jurisdictions by requiring employees to sign restrictive NDAs. And even companies that don’t specifically intend to prevent whistleblowing might take a “better safe than sorry” approach and adopt NDAs so broad and restrictive that they effectively deter whistleblowers. Of course, employees have the option of quitting rather than agreeing to sign, but very few people in the real world seriously consider doing that when they’re filling out hiring paperwork (or when they’re filling out departure paperwork and their employer is threatening to withhold their vested equity, as the case may be). 

b. Trade secret law

Historically, frontier AI developers have often recognized that their work has immense public significance and that the public therefore has a strong interest in access to information about models. However, this interest is sometimes in tension with both the commercial interests of developers and the public’s interest in public safety. This tension is at the heart of the debate over open source vs. closed models, and it gave rise to the ironic closing-off of “OpenAI.” 

The same tension also exists between the public’s interest in protecting whistleblowers and the interests of both companies and the public in protecting trade secrets. An overly broad whistleblower law that protected all employee disclosures related to frontier models would allow companies to steal model weights and algorithmic secrets from their competitors by simply poaching individual employees with access to the relevant information. In addition to being unfair, this would harm innovation in the long run, because a developer has less of an incentive to invest in research if any breakthroughs will shortly become available to its competitors. Furthermore, an overbroad whistleblower law might also actually create risks to public safety if it protected the public disclosure of information about dangerous capabilities that made it easier for bad actors or foreign powers to replicate those capabilities.

A “trade secret” is a piece of information, belonging to a company that makes reasonable efforts to keep it secret, that derives economic value from being kept secret. Wrongfully disclosing trade secrets is illegal under both state and federal law, and employees who disclose trade secrets can be sued or even criminally charged. Since 2016, however, the Defend Trade Secrets Act has provided immunity from both civil and criminal liability for disclosing a trade secret if the disclosure is made “(i) in confidence to a Federal, State, or local government official, either directly or indirectly, or to an attorney; and (ii) solely for the purpose of reporting or investigating a suspected violation of law.” In other words, the status quo for AI whistleblowers is essentially that they can disclose trade secret information only if the information concerns a violation of the law and only if they disclose it confidentially to the government, perhaps via a lawyer.

4. Why is it important to pass new AI whistleblower legislation?

Most of the employees working on the frontier models that are expected to generate many of the most worrying AI risks are located in California and entitled to the protection of California’s robust whistleblower statute. There are also existing federal statutory and common law protections that might prove relevant in a pinch; the OpenAI whistleblowers, for example, wrote to the SEC arguing that OpenAI’s NDAs violated the SEC’s rule against NDAs that fail to exempt reporting to the SEC about securities violations. However, there are important gaps in existing whistleblower protections that should be addressed by new federal and state legislation.

Most importantly, the existing California whistleblower statute only protects whistleblowers who report a violation of some law or regulation. But, as a number of existing federal and state laws recognize, there are times when information about significant risks to public safety or national security should be disclosed to the proper authorities even if no law has been broken. Suppose, for example, that internal safety testing demonstrates that a given model can, with a little jailbreaking, be coaxed into providing extremely effective help to a bad actor attempting to manufacture bioweapons. If an AI company chooses to deploy the model anyway, and an employee who worked on safety testing the model wants to bring the risk to the government’s attention through the proper channels, it seems obvious that they should be protected from retaliation for doing so. Unless the company’s actions violated some law or regulation, however, California’s existing whistleblower statute would not apply. To fill this gap, any federal AI whistleblower statute should protect whistleblowers who report information about significant risks from AI systems through the proper channels even if no law has been violated. California’s SB 53 would help to address this issue, but the scope of that statute is so narrow that additional protections would still be useful even if SB 53 is enacted.

Additionally, readers who followed the debate over SB 1047 may recall a number of reasons for preferring a uniform federal policy to a policy that applies only in one state, no matter how important that state is. Not every relevant company is located in California, and there’s no way of knowing for certain where all of the companies that will be important to the development of advanced AI systems in the future will be located. Federal AI whistleblower legislation, if properly scoped, would provide consistency and prevent the emergence of an inconsistent patchwork of state protections.

New whistleblower legislation specifically for AI would also provide clarity to potential whistleblowers and raise the salience of AI whistleblowing. By default, many people who could come forward with potentially valuable information will not do so. Anything that reduces the level of uncertainty potential whistleblowers face and eliminates some of the friction involved in the disclosure process is likely to increase the number of whistleblowers who decide to come forward. Even an employee who would have been covered by existing California law or by common-law protections might be more likely to come forward if they saw, for example, a news item about a new statute that more clearly and precisely established protections for the kind of disclosure being contemplated. In other words, “whistleblowing systems should be universally known and psychologically easy to use – not just technically available.”

5. Key decision points for whistleblower legislation

There are also a number of other gaps in existing law that new state or federal whistleblower legislation could fill. This section discusses three of the most important decision points that lawmakers crafting state or federal AI whistleblower legislation will encounter: whether and how to include a formal reporting process, what the scope of the included protections should be, and whether to prohibit contracts that waive whistleblower protections.[ref 4]

a. Reporting process

Any federal AI whistleblower bill should include a formal reporting process for AI risks. This could take the form of a hotline or a designated government office charged with receiving, processing, and perhaps responding to AI whistleblower disclosures. Existing federal statutes that protect whistleblowers who report on hazardous conditions, such as the Federal Railroad Safety Act and the Surface Transportation Assistance Act, often direct an appropriate agency to promulgate regulations[ref 5] establishing a process by which whistleblowers can report “security problems, deficiencies, or vulnerabilities.” 

The main benefit of this approach would be the creation of a convenient default avenue for reporting, but there would also be incidental benefits.  For example, the existence of a formal government channel for reporting might partially address industry concerns about trade secret protection and the secure processing of sensitive information, especially if the established channel was the only legally protected avenue for reporting. Establishing a reporting process also provides some assurance to whistleblowers that the information they disclose will come to the attention of the government body best equipped to process and respond appropriately to it.[ref 6] Ideally, the agency charged with receiving reports would have preexisting experience with the secure processing of information related to AI security; if the Trump administration elects to allow the Biden administration’s reporting requirements for frontier AI developers to continue in some form, the natural choice would be whatever agency is charged with gathering and processing that information (currently the Department of Commerce’s Bureau of Industry and Security).

b. Scope of protection

Another key decision point for policymakers is the determination of the scope of the protection offered to whistleblowers—in other words, the actions and the actors that should be protected. California’s SB 53, which was clearly drafted to minimize controversy rather than to provide the most robust protection possible, only protects a whistleblower if either:

(a) the whistleblower had “reasonable cause to believe” that they were disclosing information regarding “critical risk,” defined as—

  1. a “foreseeable and material risk” of 
  2. killing or seriously injuring more than 100 people or causing at least one billion dollars’ worth of damage, via
  3. one of four specified harm vectors—creating CBRN weapons, a cyberattack, loss of control, or AI model conduct with “limited human intervention” that would be a crime if committed by a human, or

(b) the whistleblower had reasonable cause to believe that their employer had “made false or misleading statements about its management of critical risk.”

This is a hard standard to meet. It’s plausible that an AI company employee could be aware of some very serious risk that didn’t threaten a full billion dollars in damage—or even a risk that did threaten hundreds of lives and billions of dollars in damages, but not through one of the four specified threat vectors—and yet not be protected under the statute. Imagine, for example, that internal safety testing at an AI lab showed that a given frontier model could, with a little jailbreaking, provide extremely effective guidance on how to build conventional explosives and use them to execute terrorist attacks. Even if the lab withheld those test results and issued false public statements about the model’s evaluation results, a potential whistleblower would likely not be protected under SB 53 for reporting this information, because conventional explosives are not one of the four specified harm vectors.
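For readers who find it easier to see the structure in code, here is a rough, purely illustrative sketch of the two prongs just described. The field names, the boolean simplification, and the example values are our own paraphrase for illustration only; they are not the bill’s actual text, and a real protection determination would turn on legal analysis rather than a simple boolean check.

```python
# Purely illustrative sketch of the SB 53 standard as paraphrased above.
# Field names and the boolean simplification are hypothetical, not statutory text.
from dataclasses import dataclass

# The four specified harm vectors, as summarized in the list above.
SPECIFIED_VECTORS = {"cbrn_weapons", "cyberattack", "loss_of_control", "autonomous_crime"}


@dataclass
class Disclosure:
    foreseeable_and_material: bool  # the whistleblower's reasonable belief throughout
    deaths_or_serious_injuries: int
    damage_usd: float
    harm_vector: str
    false_statements_about_critical_risk_management: bool


def likely_protected_under_sb53(d: Disclosure) -> bool:
    """Approximates the two prongs described above; emphatically not legal advice."""
    severe_enough = d.deaths_or_serious_injuries > 100 or d.damage_usd >= 1_000_000_000
    # Prong (a): information regarding a "critical risk"
    critical_risk = (
        d.foreseeable_and_material
        and severe_enough
        and d.harm_vector in SPECIFIED_VECTORS
    )
    # Prong (b): false or misleading statements about the management of critical risk
    return critical_risk or d.false_statements_about_critical_risk_management


# The conventional-explosives hypothetical from the text: severe harm, but not via one
# of the four specified vectors, and the false statements concern a risk that falls
# outside the "critical risk" definition, so prong (b) does not help either.
example = Disclosure(
    foreseeable_and_material=True,
    deaths_or_serious_injuries=150,
    damage_usd=2_000_000_000,
    harm_vector="conventional_explosives",
    false_statements_about_critical_risk_management=False,
)
assert likely_protected_under_sb53(example) is False
```

The point of the sketch is simply that prong (a) is a conjunction of several demanding conditions, so plenty of serious risks fall outside it.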

Compare that standard to the one in Illinois’ whistleblower protection statute, which instead protects any employee who discloses information while having a “good faith belief” that the information relates to an activity of their employer that “poses a substantial and specific danger to employees, public health, or safety.”[ref 7] This protection applies to all employees in Illinois,[ref 8] not just employees at frontier AI companies. The federal Whistleblower Protection Act, which applies to federal employees, uses a similar standard—the whistleblower must “reasonably believe” that their disclosure is evidence of a “substantial and specific danger to public health or safety.” 

Both of those laws apply to a far broader category of workers than an industry-specific frontier AI whistleblower statute would, and they both allow the disclosure to be made to a relatively wide range of actors. It doesn’t seem at all unreasonable to suggest that AI whistleblower legislation, whether state or federal, should similarly protect disclosures when the whistleblower believes in good faith that they’re reporting on a “substantial and specific” potential danger to public health, public safety, or national security. If labs are worried that this might allow for the disclosure of valuable trade secrets, the protection could be limited to employees who make their reports to a designated government office or hotline that can be trusted to securely handle the information it receives. 

In addition to specifying the kinds of disclosures that are protected, a whistleblower law needs to provide clarity on precisely who is entitled to receive protection for blowing the whistle. Some whistleblower laws cover only “employees,” and define that term to exclude, e.g., independent contractors and volunteers. This kind of restriction would be inadvisable in the AI governance context. Numerous proposals have been made for various kinds of independent, and perhaps voluntary, third party testing and auditing of frontier AI systems. The companies and individuals conducting those tests and audits would be well-placed to become aware of new risks from frontier models.  Protecting the ability of those individuals to securely and confidentially report risk-related information to the government should be a priority. Here, the scope of California’s SB 53 seems close to ideal—it covers contractors, subcontractors, and unpaid advisors who work for a business as well as ordinary employees. 

c. Prohibiting contractual waivers of whistleblower protections 

The ideal AI whistleblower law would provide that its protections could not be waived by an NDA or any similar contract or policy. Without such a provision, the effectiveness of any whistleblower law could be blunted by companies requiring employees to sign a relatively standard broad NDA, even if the company didn’t specifically intend to restrict whistleblowing. While a court might hold that such an NDA was unenforceable under common law principles, the uncertainty surrounding how a given court might view a given set of circumstances means that even an unenforceable NDA might have a significant impact on the likelihood of whistleblowers coming forward.

It is possible to pass laws directly prohibiting contracts that discourage whistleblowing—the SEC, for example, often brings charges under the Securities Exchange Act against companies that require employees to sign broad nondisclosure agreements if those agreements don’t include an exception allowing whistleblowers to report information to the SEC. A less controversial approach might be to declare such agreements unenforceable; this, for example, is what 18 U.S.C. § 1514A (another federal law relating to whistleblowing in the securities context) does. California’s SB 53 and some other state whistleblower laws do something similar, but with one critical difference—they prohibit employers from adopting “any rule, regulation, or policy” preventing whistleblowing, without specifically mentioning contracts. The language in SB 53, while helpful, likely wouldn’t cover individualized nondisclosure agreements that aren’t the result of a broader company policy.[ref 9] In future state or federal legislation, it would be better to use language more like that in 18 U.S.C. § 1514A, which states that “The rights and remedies provided for in this section may not be waived by any agreement, policy form, or condition of employment, including by a predispute arbitration agreement.”

Conclusion

Whistleblower protections for employees at frontier AI companies are a fairly hot topic these days. Numerous state bills have been introduced, and there’s a good chance that federal legislation will follow. The idea seems to have almost as much currency with libertarian-minded private governance advocates as it does with European regulators: California SB 813, the recent proposal for establishing a system of “semiprivate standards organizations” to privately regulate AI systems, would require would-be regulators to attest to their plan for “implementation and enforcement of whistleblower protections.” 

There’s reasonably widespread agreement, in other words, that it’s time to enact protections for AI whistleblowers. This being the case, it makes sense for policymakers and commentators who take an interest in this sort of thing to develop some informed opinions about what whistleblower laws are supposed to do and how best to design a law that does those things. 

Our view is that an AI whistleblower law is essentially an information-gathering authority—a low-cost, innovation-friendly way to tweak the incentives of people with access to important information so that they’re more likely to make disclosures that benefit the public interest. It’s plausible that, from time to time, individual workers at the companies developing transformative AI systems will become aware of important nonpublic information about risks posed by those systems. Removing obstacles to disclosing that information will, on the margin, encourage additional disclosures and benefit the public. But passing “an AI whistleblower law” isn’t enough. Anyone trying to design such a law will face a number of important decisions about how to structure the offered protections and how to balance companies’ legitimate interest in safeguarding confidential information against the public’s interest in transparency. There are better and worse ways of proceeding, in other words; the idea behind this post was to shed a bit of light on which are which.

LawAI’s comments on the Draft Report of the Joint California Policy Working Group on AI Frontier Models

At Governor Gavin Newsom’s request, a joint working group released a draft report on March 18, 2025 setting out a framework for frontier AI policy in California. Several of the staff at the Institute for Law & AI submitted comments on the draft report as it relates to their existing research. Read their comments below:

These comments were submitted to the Working Group as feedback on April 8, 2025. The opinions expressed in these comments are those of the authors and do not reflect the views of the Institute for Law & AI.

Liability and Insurance Comments

by Gabriel Weil and Mackenzie Arnold

Key Takeaways

  1. Insurance is a complement to, not a replacement for, clear tort liability.
  2. Correctly scoped, liability is compatible with innovation and well-suited to conditions of uncertainty.
  3. Safe harbors that limit background tort liability are a risky bet when we are uncertain about the magnitude of AI risks and have yet to identify robust mitigations.

Whistleblower Protections Comments

by Charlie Bullock and Mackenzie Arnold

Key Takeaways

  1. Whistleblowers should be protected for disclosing information about risks to public safety, even if no law, regulation, or company policy is violated.
  2. California’s existing whistleblower law already protects disclosures about companies that break the law; subsequent legislation should focus on other improvements.
  3. Establishing a clear reporting process or hotline will enhance the effectiveness of whistleblower protections and ensure that reports are put to good use.

Scoping and Definitions Comments

by Mackenzie Arnold and Sarah Bernardo

Key Takeaways

  1. Ensuring that a capable entity regularly updates what models are covered by a policy is a critical design consideration that future-proofs policies.
  2. Promising techniques to support updating include legislative purpose clauses, periodic reviews, designating a capable updater, and providing that updater with the information and expertise needed to do the job.
  3. Compute thresholds are an effective tool to right-size AI policy, but they should be paired with other tools like carve-outs, tiered requirements, multiple definitions, and exemptions to be most effective.
  4. Compute thresholds are an excellent initial filter to determine what models are in scope, and capabilities evaluations are a particularly promising complement.
  5. In choosing a definition of covered models, policymakers should consider how well the definitional elements are risk-tracking, resilient to circumvention, clear, and flexible—in addition to other factors discussed in the Report.

Draft Report of the Joint California Policy Working Group on AI Frontier Models – scoping and definitions comments

These comments on the Draft Report of the Joint California Policy Working Group on AI Frontier Models were submitted to the Working Group as feedback on April 8, 2025. The opinions expressed in these comments are those of the authors and do not reflect the views of the Institute for Law & AI. 

Commendations

1. The Report correctly identifies that AI models and their risks vary significantly and thus merit different policies with different inclusion criteria.

Not all AI policies are made alike. Those that target algorithmic discrimination, for example, concern a meaningfully different subset of systems, actors, and tradeoffs than a policy that targets cybersecurity threats. What’s more, the market forces affecting these different policies vary considerably. For example, one might be far more concerned about limiting innovation in a policy context where many small startups are attempting to integrate AI into novel, high-liability-risk contexts (e.g., healthcare) and less concerned in contexts that involve a few large actors receiving large, stable investments, where the rate of tort litigation is much lower absent grievous harms (e.g., frontier model development). That’s all to say: It makes sense to foreground the need to scope AI policies according to the unique issue at hand.

2. We agree that at least some policies should squarely address foundation models as a distinct category.

Foundation models, in particular those that present the most advanced or novel capabilities in critical domains, present unique challenges that merit separate treatment. These differences emerge from the unique characteristics of the models themselves, not their creators (who vary considerably) or their users. And the potential benefits and risks that foundation models present cut across clean sectoral categories.

3. We agree that thresholds are a useful and necessary tool for tailoring laws and regulations (even if they are imperfect).

Thresholds are easy targets for criticism. After all, there is something inherently arbitrary about setting a speed limit at 65 miles per hour rather than 66. Characteristics are more often continuous than binary, so typically there isn’t a clear category shift after you cross over some talismanic number. But this issue isn’t unique to AI policy, and in every other context, government goes on nonetheless. As the Report notes, policy should be proportional in its effects and appropriately narrow in its application. Thresholds help make that possible.

4. The Report correctly acknowledges the need to update thresholds and definitional criteria over time.

We agree that specific threshold values and related definitional criteria will likely need to be updated to keep up with technological advances. Discrete, quantitative thresholds are particularly at risk of becoming obsolete. For instance, thresholds based on training compute may become obsolete due to a variety of AI developments, including improvements in compute and algorithmic efficiency, techniques such as distillation, and/or the growing impact of inference scaling. Given the competing truths that setting some threshold is necessary and that any threshold will inevitably become obsolete, ensuring that definitions can be quickly, regularly, and easily updated should be a core design consideration. 

5. We agree that, at present, compute thresholds (combined with other metrics and/or thresholds) are preferable to developer-level thresholds.

Ultimately, the goal of a threshold is to set a clear, measurable, and verifiable bar that correlates with the risk or benefit the policy attempts to address. In this case, a compute threshold best satisfies those criteria—even if it is imperfect. For more discussion, see Training Compute Thresholds: Features and Functions in AI Regulation and The Role of Compute Thresholds for AI Governance.

Recommendations

1. The Report should further emphasize the centrality of updating thresholds and definitional criteria.

Updating is perhaps the most important element of an AI policy. Without it, the entire law may in short order cease to cover the conduct or systems policymakers aimed to target. We should expect this to happen by default. The error may be one of overinclusion—for example, large systems may present few or manageable risks even after a compute threshold is crossed. After some time, we will be confident that these systems do not merit special government attention and will want to remove obligations that attach to them. The error may be one of underinclusion—for example, improvements in compute or algorithmic efficiency, techniques such as distillation, and/or the growing impact of inference scaling may mean that models below the threshold merit inclusion. The error may be in both directions—a truly unfortunate, but entirely plausible, result. Either way, updating will be necessary for policy to remain effective.

We raise this point because without key champions, updating mechanisms will likely be left out of California AI legislation—leading to predictable policy failures. While updating has been incorporated into many laws and regulations, it was notably absent from the final draft of SB 1047 (save for an adjustment for inflation). A similar omission must not befall future bills if they are to remain effective long-term. A clear statement by the authors of the Report would go a long way toward making updating feasible in future legislation.

Recommendation: The Report should clearly state that updating is necessary for effective AI policy and explain why policy is likely to become ineffective if updating is not included. It should further point to best practices (discussed below) to address common concerns about updating.

2. The Report should highlight key barriers to effective updating and tools to manage those barriers.

Three major barriers stand in the way of effective updating. First is the concern that updating may lead to large or unpredictable changes, creating uncertainty or surprise and making it more difficult for companies to engage in long-term planning or fulfill their compliance obligations. Second, some (understandably) worry that overly broad grants of discretion to agencies to update the scope of regulation will lead to future overreach, extending powers to contexts far beyond what was originally intended by legislators. Third, state agencies may lack sufficient capacity or knowledge to effectively update definitions.

The good news: These concerns can be addressed. Establishing predictable periodic reviews, requiring specific procedures for updates, and ensuring consistent timelines can limit uncertainty. Designating a competent updater and supplying them with the resources, data, and expert consultation they need can address concerns about agency competency. And constraining the option space of future updates can limit both surprise and the risk of overreach. When legislators are worried about agency overreach, their concern is typically that the law will be altered to extend to an unexpected context far beyond what the original drafters intended—for example, using a law focused on extreme risks to regulate mundane online chatbots or in a way that increases the number of regulated models by several orders of magnitude. To combat this worry, legislators can include a purpose clause that directly states the intended scope of the law and the boundaries of future updates. For example, a purpose clause could specify that future updates extend “only to those models that represent the most advanced models to date in at least one domain or materially and substantially increase the risk of harm X.” Purpose clauses can also come in the imperative or negative. For example, “in updating the definition in Section X, Regulator Y should aim to adjust the scope of coverage to exclude models that Regulator Y confidently believes pose little or no material risk to public health and safety.”

Recommendation: The Report should highlight the need to address the risks of uncertainty, agency overreach, and insufficient agency capacity when updating the scope of legislation. It should further highlight useful techniques to manage these issues, namely, (a) including purpose clauses or limitations in the relevant definitions, (b) specifying the data, criteria, and public input to be considered in updating definitions, (c) establishing periodic reviews with predictable frequencies, specific procedures, and consistent timelines, (d) designating a competent updater that has adequate access to expertise in making their determinations, (e) ensuring sufficient capacity to carry out periodic reviews and quickly make updates outside of such reviews when necessary, and (f) providing adequate notice and opportunity for input. 

3. The Report should highlight other tools beyond thresholds to narrow the scope of regulations and laws—namely, carve-outs, tiered requirements, multiple definitions, and exemption processes.

Thresholds are not the only option for narrowing the scope of a law or regulation, and highlighting other options increases the odds that a consensus will emerge. Too often, debates around the scope of AI policy get caught on whether a certain threshold is overly burdensome for a particular class of actor. But adjusting the threshold itself is often not the most effective way to limit these spillover effects. The tools below are strong complements to the recommendations currently made in the Report.

By carve-outs, we mean full statutory exclusions from coverage (at least for purposes of these comments). Common carve-outs to consider include:

This is not to say that these categories should always be exempt, but rather that making explicit carve-outs for these categories will often ease tensions over specific thresholds. In particular, it is worth noting that while current open-source systems are clearly net-positive according to any reasonable cost-benefit calculus, future advances could plausibly merit some regulatory oversight. For this reason, any carve-out for open-source systems should be capable of being updated if and when that balance changes, perhaps with a heightened evidentiary burden for beginning to include such systems. For example, open-source systems might be generally exempt, but a restriction may be imposed upon a showing that the open-source systems materially increase marginal risk in a specific category, that other less onerous restrictions do not adequately limit this risk, and that the restriction is narrowly tailored. 

Related, but less binary, is the use of tiered requirements that impose only a subset of requirements, or weaker requirements, on these favored models or entities, such as requiring smaller entities to submit certain reports while not requiring them to perform the same evaluations. For this reason, more legislation should likely include multiple or separate definitions of covered models to enable a more nimble, select-only-those-that-apply approach to requirements.

Another option is to create exemption processes whereby entities can be relieved of their obligations if certain criteria are met. For example, a model might be exempt from certain requirements if it has not, after months of deployment, materially contributed to a specific risk category or if the model has fallen out of use. Unlike the former two options, these exemption processes can be tailored to case-by-case fact patterns and occur long after the legislative or regulatory process. They may also better handle harder-to-pin-down factors like whether a model creates exceptional risk. These exemption processes can vary in a few key respects, namely:

Recommendation: The Report already mentions that exempting small businesses from regulations will sometimes be desirable. It should build on this suggestion by emphasizing the utility of carve-outs, tiered requirements, multiple definitions, and exemption processes (in addition to thresholds) to further refine the category of regulated models. It should also outline some of the common carve-out categories (noting the value of maintaining option value by ensuring that carve-outs for open-source systems are revised and updated if the cost-benefit balance changes in the future) as well as key considerations in creating exemption processes. 

4. We recommend that the Report elaborate on the approach of combining different types of thresholds by discussing the complementary pairing of compute and capabilities thresholds.

It is important to provide additional detail about other metrics that could be combined with compute thresholds because this approach is promising and one of the most actionable items in the Report. We recommend capabilities thresholds as a complement to compute thresholds in order to leverage the advantages of compute that make it an excellent initial filter, while making up for its limitations with evaluations of capabilities, which are better proxies for risk and more future-proof. Other metrics could also be paired with compute thresholds in order to more closely track the desired policy outcome, such as risk thresholds or impact-level properties; however, they have practical issues, as discussed in the Report.

Recommendation: The Report should expand on its suggestion that compute thresholds be combined with other metrics and thresholds by noting that capabilities evaluations may be a particularly promising complement to compute thresholds, as they more closely correspond to risk and are more adaptable to future developments and deployment in different contexts. Other metrics could also be paired with compute thresholds in order to more closely track the desired policy outcome, such as risk evaluations or impact-level properties.

5. The Report should note additional definitional considerations in the list in Section 5.1—namely, risk-tracking, resilience to circumvention, clarity, and flexibility.

The Report correctly highlights three considerations that influence threshold design: determination time, measurability, and external verifiability. 

Recommendation: We recommend that the Report note four additional definitional considerations, namely, how well the definitional elements track risk, how resilient they are to circumvention, how clear they are, and how flexible they are.

For more discussion, see Training Compute Thresholds: Features and Functions in AI Regulation and The Role of Compute Thresholds for AI Governance.

Draft Report of the Joint California Policy Working Group on AI Frontier Models – whistleblower protections comments

These comments on the Draft Report of the Joint California Policy Working Group on AI Frontier Models were submitted to the Working Group as feedback on April 8, 2025. The opinions expressed in these comments are those of the authors and do not reflect the views of the Institute for Law & AI.

We applaud the Working Group’s decision to include a section on whistleblower protections. Whistleblower protections are light-touch, innovation-friendly interventions that protect employees who act in good faith, enable effective law enforcement, and facilitate government access to vital information about risks. Below, we make a few recommendations for changes that would help the Report more accurately describe the current state of whistleblower protections and more effectively inform California policy going forward. 

1. Whistleblowers should be protected for disclosing risks to public safety even if no company policy is violated 

The Draft Report correctly identifies the importance of protecting whistleblowers who disclose risks to public safety that don’t involve violations of existing law. However, the Draft Report seems to suggest that this protection should be limited to circumstances where risky conduct by a company “violate[s] company policies.” This would be a highly unusual limitation, and we strongly advise against including language that could be interpreted to recommend it. A whistleblower law that only applied to disclosures relating to violations of company policies would perversely discourage companies from adopting strong internal policies (such as responsible scaling policies). This would blunt the effectiveness of whistleblower protections and perhaps lead to companies engaging in riskier conduct overall.

To avoid that undesirable result, existing whistleblower laws that protect disclosures regarding risks in the absence of direct law-breaking focus on the seriousness and likelihood of the risk rather than on whether a company policy has been violated. See, for example: 5 U.S.C. § 2302(b)(8) (whistleblower must “reasonably believe” that their disclosure is evidence of a “substantial and specific danger to public health or safety”); 49 U.S.C. § 20109 (whistleblower must “report[], in good faith, a hazardous safety or security condition”); 740 ILCS 174/15 (Illinois) (whistleblower must have a “good faith belief” that disclosure relates to activity that “poses a substantial and specific danger to employees, public health, or safety.”). Many items of proposed AI whistleblower legislation in various states also recognize the importance of protecting this kind of reporting. See, for example: California SB 53 (2025–2026) (protecting disclosures by AI employees related to “critical risks”); Illinois HB 3506 (2025–2026) (similar); Colorado HB25-1212 (protecting disclosures by AI employees who have “reasonable cause to believe” the disclosure relates to activities that “pose a substantial risk to public safety or security, even if the developer is not out of compliance with any law”).

We recommend that the report align its recommendation with these more common, existing whistleblower protections, by (a) either omitting the language regarding violations of internal company policy or qualifying it to clarify that the Report is not recommending that such violations be used as a requirement for whistleblower protections to apply; and (b) explicitly referencing common language used to describe the type of disclosures that are protected even in the absence of lawbreaking.

2. The report’s overview of existing law should discuss California’s existing protections

The report’s overview of existing whistleblower protections makes no mention of California’s whistleblower protection law, California Labor Code § 1102.5. That law protects both public and private employees in California from retaliation for reporting violations of any state, federal, or local law or regulation to a government agency or internally within a company. It also prohibits employers from adopting any internal policies to prevent employees from whistleblowing. 

This is critical context for understanding the current state of California whistleblower protections and the gaps that remain. The fact that § 1102.5 already exists and applies to California employees of AI companies means that additional laws specifically protecting AI employees from retaliation for reporting law violations would likely be redundant unless they added something new—e.g., protection for good faith disclosures relating to “substantial and specific dangers to public health or safety.”

This information could be inserted into the subsection on “applicability of existing whistleblower protections.”

3. The report should highlight the importance of establishing a reporting process

Protecting good-faith whistleblowers from retaliation is only one lever to ensure that governments and the public are adequately informed of risks. Perhaps even more important is ensuring that the government of California appropriately handles that information once it is received. One promising way to facilitate the secure handling of sensitive disclosures is to create a designated government hotline or office for AI whistleblower disclosures. 

This approach benefits all stakeholders:

The report already touches briefly on the desirability of “ensuring clarity on the process for whistleblowers to safely report information,” but a more specific and detailed recommendation would make this section of the Report more actionable. Precisely because of our uncertainty about the risks posed by future AI systems, there is great option value in building the government’s capacity to quickly, competently, and securely react to new information received through whistleblowing. By default, we might expect that no clear chain of command will exist for processing this new information, sharing it securely with key decision makers, or operationalizing it to improve decision making. This increases coordination costs and may ultimately result in critical information being underutilized or ignored.

Draft Report of the Joint California Policy Working Group on AI Frontier Models – liability and insurance comments

These comments on the Draft Report of the Joint California Policy Working Group on AI Frontier Models were submitted to the Working Group as feedback on April 8, 2025. Any opinions expressed in these comments are those of the authors and do not reflect the views of the Institute for Law & AI.

Comment 1: The draft report correctly points to insurance as a potentially useful policy lever. But it incorrectly suggests that insurance alone (without liability) will cause companies to internalize their costs. Insurance likely will not work without liability, and the report should acknowledge this.

Insurance could advance several goals at the center of this report. Insurance creates private market incentives to more accurately measure and predict risk, as well as to identify and adopt effective safety measures. It can also bolster AI companies’ ability to compensate victims for large harms caused by their systems. The value of insurance is potentially limited by the difficulty of modeling at least some risks in this context, but to the extent that the report’s authors are enthusiastic about insurance, it is worth highlighting that these benefits depend on the underlying prospect of liability. If AI companies are not–and do not expect to be–held liable when their systems harm their customers or third parties, they would have no reason to purchase insurance to cover those harms and inadequate incentives to mitigate those risks. 

Passing state laws that require insurance doesn’t solve this problem either: if companies aren’t held liable for harms they generate (because of gaps in existing law, newly legislated safe harbors, federal preemption, or simple underenforcement), insurance plans would cease to accurately track risk.

In section 1.3, the draft report suggests efforts to:

“reconstitute market incentives for companies to internalize societal externalities (e.g., incentivizing insurance may mold market forces to better prioritize public safety).”

We propose amending this language to read:

reconstitute market incentives for companies to internalize societal externalities (e.g., clear liability rules, especially for harms to non-users, combined with incentives to acquire liability insurance may mold market forces to better prioritize public safety).

Comment 2: Liability can be a cost-effective tool for mitigating risk without discouraging innovation, especially under conditions of uncertainty. And many of the report’s transparency suggestions would improve the efficiency of liability and private contracting. The report should highlight this.

Overall, the report provides minimal discussion of liability as a governance tool. To the extent it does, the tone (perhaps) suggests skepticism of liability-based governance (“In reality, when governance mechanisms are unclear or underdeveloped, oversight often defaults largely to the courts, which apply existing legal frameworks—such as tort law…”). 

But liability is a promising tool, even more so given the considerable uncertainty surrounding future AI risks–a point that the authors correctly emphasize is the core challenge of AI policy. 

Liability has several key advantages under conditions of uncertainty. Liability is:

Ex ante regulations require companies to pay their costs upfront. Where those costs are large, such regulations depend on a strong social consensus about the magnitude of the risks that they are designed to mitigate. Prescriptive rules and approval regulation regimes, the most common forms of ex ante regulation, also depend on policymakers’ ability to identify specific precautionary measures early on, which is challenging in a nascent field like AI, where best practices are still being developed and considerable uncertainty exists about the severity and nature of potential risks.

Liability, by contrast, scales automatically with the risk and shifts decision-making regarding what mitigation measures to implement to the AI companies, who are often best positioned to identify cost-effective risk mitigation strategies. 

Concerns about excessive litigation are reasonable but can be mitigated by allowing wide latitude for contracts to waive and allocate liability between model developers, users, and various intermediaries–with the notable exception of third-party harm, where the absence of contractual privity does not allow for efficient contracting. In fact, allocation of responsibility by contract goes hand-in-hand with the transparency and information-sharing recommendations highlighted in the report–full information allows for efficient contracting. Risk of excessive litigation also varies by context, being least worrisome where the trigger for liability is clear and rare (as is the case with liability for extreme risks) and most worrisome where the trigger is more common and occurs in a context where injuries are common even when the standard of care is followed (e.g., in the context of healthcare). There may be a case for limiting liability in contexts where false positives are likely to abound, but liability is a promising, innovation-compatible tool in some of the contexts at the center of this report.

A strong summary of the potential uses and limitations of liability for AI risk would note both its advantages under uncertainty and the narrower contexts, described above, in which limits on liability may be warranted.

Comment 3: Creating safe harbors that protect AI companies from liability is a risky strategy, given the uncertainty about both the magnitude of risks posed by AI and the effectiveness of various risk mitigation strategies. The report should note this.

In recent months, several commentators have called for preemption of state tort law or the creation of safe harbors in return for compliance with some of the suggestions made in this report. While we believe that the policy tools outlined in the report are important, it would be a valuable clarification for the report to state that these requirements alone do not merit the removal of background tort law protections.

Under existing negligence law, companies can, of course, argue that their compliance with many of the best practices outlined in this report is evidence of reasonable care. But, as outlined above, tort law creates additional and necessary incentives that cannot be provided through reporting and evaluation alone.

As we see it, tort law is compatible with–not at odds with or replaceable by–the evidence-generating, information-rich suggestions of this report. In an ecosystem with greater transparency and better evaluations, parties will be able to even more efficiently distribute liability via contract, enhancing its benefits and more precisely distributing its costs to those best positioned to address them.  

It also merits noting that creating safe harbors based on compliance with relatively light-touch measures like transparency and third-party verification would be an unusual step historically, and would greatly reduce AI companies’ incentives to take risk-mitigation measures that are not expressly required. 

Because tort law is enhanced by the suggested policies of this report and addresses the key dilemma (uncertainty) that this report seeks to address, we recommend that the report clarify the risk posed by broad, general liability safe harbors.

Comment 4: The lesson of climate governance is that transparency alone is inadequate to produce good outcomes. When confronting social externalities, policies that directly compel the responsible parties to internalize the costs and risks that they generate are often the most efficient solutions. In the climate context, the best way to do this is with an ex ante carbon price. Given the structural features of AI risk, ex post liability plays an analogous role in AI governance.

Section 2.4 references lessons from climate change governance. “The case of fossil fuel companies offers key lessons: Third-party risk assessment could have realigned incentives to reward energy companies innovating responsibly while simultaneously protecting consumers.” In our view, this overstates the potential of transparency measures like third-party risk assessment alone and undervalues policies that compel fossil fuel companies and their consumers to internalize the costs generated by fossil fuel combustion. After all, the science on climate change has been reasonably clear for decades, and that alone has been far from sufficient to align the incentives of fossil fuel companies with social welfare. The core policy challenge of climate change is that fossil fuel combustion generates global negative externalities in the form of the heat-trapping effects of greenhouse gas emissions. Absent policies, like carbon pricing, that compel this kind of cost internalization, mere transparency about climate impacts is an inadequate response.

Third-party risk assessments and other transparency measures alone are similarly unlikely to be sufficient in the AI risk context. Transparency and third-party evaluation are best thought of as tools that help prepare us for further action (be it by generating better-quality evidence on which to regulate, enabling more efficient contracting to allocate risk, or enabling efficient litigation once harms occur). But without that further action, they forgo much of their potential value. Aligning the incentives of AI companies will require holding them financially accountable for the risks that they generate. Liability is the best accountability tool we have for AI risk, and it plays a role structurally similar to that of carbon pricing in climate risk mitigation.

We propose amending the report language to read, “The case of fossil fuel companies offers key lessons: Third-party risk assessment could have helped build the case for policies, like carbon pricing, that would have realigned incentives to reward energy companies innovating responsibly while simultaneously protecting consumers.”

Section 2.4 further states, “The costs of action to reduce greenhouse gas emissions, meanwhile, were estimated [by the Stern Review] at only 1% of global GDP each year. This is a useful lesson for AI policy: Leveraging evidence-based projections, even under uncertainty, can reduce long-term economic and security costs.” 

But this example only underscores that cost internalization mechanisms, in addition to transparency mechanisms, are key to risk reduction. The Stern Review’s cost estimates were based on the assumption that governments would implement the most cost-effective policies, like economy-wide carbon pricing, to reduce greenhouse gas emissions. Actual climate policies implemented around the world have tended to be substantially less cost-effective. This is not because carbon pricing is more costly or less effective than Stern assumed but because policymakers have been reluctant to implement it aggressively, despite broad global acceptance of the basic science of climate change.

This lesson is highly relevant to AI governance inasmuch as the closest analog to carbon pricing is liability, which directly compels AI companies to internalize the risks generated by their systems, just as a carbon price compels fossil fuel companies to internalize the costs associated with their incremental contribution to climate change. An AI risk tax is impractical since it is not feasible to measure AI risk ex ante. But, unlike with climate change, it will generally be feasible to attribute AI harms to particular AI systems and to hold the companies that trained and deployed them accountable.

Supporting documents

For more on the analogy between AI liability and carbon pricing and an elaboration of a proposed liability framework that accounts for uninsurable risks, see Gabriel Weil, Tort Law as a Tool for Mitigating Catastrophic Risk from Artificial Intelligence, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4694006.

This proposal is also summarized in this magazine article: Gabriel Weil, Your AI Breaks It? You Buy It: AI developers should pay for what they screw up, Noema Mag (2024) 

For more on the case for prioritizing liability as an AI governance tool, see Gabriel Weil, Instrument Choice in AI Governance: Liability and its Alternatives, Google Docs, https://docs.google.com/document/d/1ivtgfLDQqG05U2vM1211wNtTDxNCjZr1-2NWf6tT5cU/edit?tab=t.0

The core arguments are also laid out in this Lawfare piece: Gabriel Weil, Tort Law Should Be the Centerpiece of AI Governance, Lawfare (2024).

Balancing safety and privacy: regulatory models for AI misuse

Since consumer AI tools have exploded in popularity, fears of AI-based threats to security have moved from sci-fi to reality. The FBI warns that criminals are already using AI to hack financial networks, and OpenAI disrupted an Iranian government disinformation operation last year. But risks could rapidly escalate beyond theft and propaganda to truly catastrophic threats—from designing deadly viruses to hacking into critical infrastructure. Such misuse poses a legitimate threat not only to AI users but to national security itself.

In response, proposals have emerged for mandatory monitoring and reporting mechanisms to prevent AI misuse. These proposals demand careful scrutiny. The Supreme Court typically protects reasonable expectations of privacy under the Fourth Amendment, and people may reasonably expect to use these new tools without fear of government surveillance.

Yet governments should not shy away from carefully designed oversight. AI labs likely already conduct some legal monitoring of their consenting users. In addition, U.S. law has several analogous frameworks—notably the Bank Secrecy Act and laws combating child sexual abuse material (CSAM)—that require private companies to record potentially illicit activity and/or make reports to authorities. These precedents show how reasonable monitoring regulation can help prevent crime while respecting privacy rights. 

AI Misuse Risks

Artificial intelligence systems present various categories of potential catastrophic risks, ranging from unintended accidents to loss of human control over increasingly powerful systems. But we need not imagine a “Skynet” scenario to worry about catastrophic AI. Another kind of risk is simple misuse: bad actors who intentionally use AI to do dangerous and illegal things. This intentional misuse raises particularly salient privacy concerns, as mitigating it requires monitoring individual user behavior rather than just overseeing AI systems or their developers.

While AI might enable various forms of criminal activity, from copyright infringement to fraud, two categories of catastrophic misuse merit particularly careful consideration due to their potential for widespread devastation. First, AI could dramatically lower barriers to bioterrorism by helping malicious actors design and create deadly pathogens. Current AI models can already provide detailed scientific knowledge and laboratory protocols that could potentially be exploited for biological weapons development. Researchers have shown that current language models can already directly instruct laboratory robots to carry out experiments, suggesting that as AI advances, the capability to create deadly pathogens could become increasingly available to potential bad actors.

Second, AI systems may enable unprecedented cyber warfare capabilities that could threaten critical infrastructure and national security. A recent FBI threat assessment highlights how AI could enable sophisticated cyber-physical attacks on critical infrastructure, from manipulating industrial control systems to compromising autonomous vehicle safety systems. For instance, in 2017, the “Triton” malware attack targeted petrochemical plants in the Middle East, attempting to disable critical safety mechanisms. As capabilities improve, we may see fully autonomous AI systems conducting cyberattacks with minimal human oversight. 

Government-mandated monitoring may be justified for AI risk, but it should not be taken lightly. Focusing specifically on the most serious threats helps maintain an appropriate balance between security and privacy. 

Current Safety Measures

AI developers use various methods to prevent misuse, including “fine-tuning” models and filtering suspicious prompts. However, researchers have demonstrated the ability to “jailbreak” models and bypass these built-in restrictions. This vulnerability suggests the need for a system of monitoring that allows developers to respond swiftly to initial cases of misuse by limiting the ability of the bad actor to engage in further misuse. AI providers may scan user interactions for patterns indicative of misuse attempts, flag high-risk users, and take actions ranging from warnings to imposing access restrictions or account bans.

These private monitoring efforts operate within a statutory framework that generally allows companies enough flexibility to monitor their services when necessary. The Electronic Communications Privacy Act (ECPA) restricts companies from accessing users’ communications, but contains several relevant exceptions—including consent, ordinary course of business activities, protecting the provider’s rights and property, and emergency disclosures. Technology companies typically seek to establish consent through their privacy policies (though the legal sufficiency of this approach is often questioned), and also have significant latitude to monitor communications when necessary to make their services function. The ECPA also permits disclosure to law enforcement with proper legal process, and allows emergency disclosures when providers reasonably believe there is an immediate danger of death or serious physical injury. Thus, AI providers already have legal pathways to share critical threat information with authorities, but are not under clear obligations to do so.

Incident Reporting

The shortcoming of purely internal monitoring is that malicious actors can migrate to other models after being banned or use multiple models to avoid detection. Accordingly, there is a need for centralized reporting systems to alert other developers of risks. Nonprofits like the Responsible AI Collaborative have begun to collect media reports of AI incidents, but documented real-world incidents likely represent only the tip of the iceberg. More importantly, focusing solely on successful attacks that caused harm misses the broader picture—AI providers regularly encounter suspicious behavior patterns, thwarted attempts at misuse, and users who may pose risks across multiple platforms. 

One potential model for addressing these limitations comes from requirements for reporting child sexual abuse material (CSAM). Under 18 U.S.C. § 2258A, electronic service providers must report detected CSAM to the National Center for Missing and Exploited Children, but face no obligation to proactively monitor for such material. Generally, § 2258A has survived Fourth Amendment challenges under the “private search doctrine,” which holds that the Fourth Amendment protects only against government searches, not private action. While private entity searches can be attributed to the government when there is sufficient government encouragement or participation, circuit courts have rejected Fourth Amendment challenges to § 2258A because it requires only reporting while explicitly disclaiming any monitoring requirement. As the Ninth Circuit explained in United States v. Rosenow, “mandated reporting is different than mandated searching,” because communications providers are “free to choose not to search their users’ data.”

California recently considered a similar approach to reporting in SB 1047, one provision of which would have required AI model developers to report “artificial intelligence safety incident[s]” to the state Attorney General within 72 hours of discovery. Although the bill was ultimately vetoed, this reporting-focused approach offers several advantages: it would create a central clearinghouse for incident data and facilitate coordination across competing AI labs, without imposing any direct obligation on AI companies to monitor their users.

A reporting-only mandate may paradoxically discourage active monitoring. If only required to report the problems they discover, some companies may choose not to look for them. This mirrors concerns raised during the “Crypto Wars” debates, where critics argued that encryption technology not only hindered third party access to communications but also prevented companies themselves from detecting and reporting illegal activity. For instance, while Meta reports CSAM found on public Facebook feeds, encryption is the default for channels like WhatsApp—meaning Meta can neither proactively detect CSAM on these channels nor assist law enforcement in investigating it after the fact.

AI companies might similarly attempt to move towards systems that make monitoring difficult. While most current commercial AI systems process inputs as unencrypted text, providers could shift toward local models running on users’ devices. More ambitiously, some companies are working on “homomorphic” encryption techniques—which allow computation on encrypted data—for AI models. Short of seizing the user’s device, these approaches would place AI interactions beyond providers’ reach.

Mandatory Recordkeeping

Given the limitations of a pure reporting mandate, policymakers might consider requiring AI providers to maintain certain records of user interactions, similar to bank recordkeeping requirements. The Bank Secrecy Act of 1970, passed to help law enforcement detect and prevent money laundering, provides an instructive precedent. The Act required banks both to maintain records of customer identities and transactions and to report transactions above specified thresholds. The Act faced immediate constitutional challenges, but the Supreme Court upheld it in California Bankers Association v. Shultz (1974). The Court highlighted several factors that overcame the plaintiffs’ objections: the Act did not authorize direct government access without legal process; the requirements focused on specific categories of transactions rather than general surveillance; and there was a clear nexus between the recordkeeping and legitimate law enforcement goals.

This framework suggests how AI monitoring requirements might be structured: focusing on specific high-risk patterns rather than blanket surveillance, requiring proper legal process for government access, and maintaining clear links between the harm being protected against (catastrophic misuse) and the kinds of records being kept. 

Unlike bank records, however, AI interactions have the potential to expose intimate thoughts and personal relationships. Recent Fourth Amendment doctrine suggests that this type of privacy may merit a higher level of scrutiny.

Fourth Amendment Considerations

The Supreme Court’s modern Fourth Amendment jurisprudence begins with Katz v. United States (1967), which established that government surveillance constitutes a “search” when it violates a “reasonable expectation of privacy.” Under the subsequent “third-party doctrine” developed in United States v. Miller (1976) and Smith v. Maryland (1979), individuals generally have no reasonable expectation of privacy in information voluntarily shared with third parties. This might suggest that AI interactions, like bank records, fall outside Fourth Amendment protection.

However, a growing body of federal case law has increasingly recognized heightened privacy interests in digital communications. In United States v. Warshak (2010), the Sixth Circuit found emails held by third parties deserve greater Fourth Amendment protection than traditional business records, due to their personal and confidential nature. Over the next decade, the Supreme Court similarly extended Fourth Amendment protections to GPS tracking, cell phone searches, and finally, cell-site location data. The latter decision, Carpenter v. United States (2018), was heralded as an “inflection point” in constitutional privacy law for its potentially broad application to various kinds of digital data, irrespective of who holds it. 

Though scholars debate Carpenter’s ultimate implications, early evidence suggests that courts are applying some version of the key factors that the opinion indicates are relevant for determining whether digital data deserves Fourth Amendment protection: (1) the “deeply revealing nature” of the information, (2) its “depth, breadth, and comprehensive reach,” and (3) whether its collection is “inescapable and automatic.”

All three factors raise concerns about AI monitoring. First, if Carpenter worried that location data could reveal personal associations in the aggregate, AI interactions can directly expose intimate thoughts and personal relationships. The popularity of AI companions designed to simulate close personal relationships is only an extreme version of the kind of intimacy someone might have with their chatbot. Second, AI’s reach is rapidly expanding – ChatGPT reached 100 million monthly active users within two months of launch, suggesting it may approach the scale of “400 million devices” that concerned the Carpenter Court. The third factor currently presents the weakest case for protection, as AI interactions still involve conscious queries rather than automatic collection. However, as AI becomes embedded into computer interfaces and standard work tools, using these systems may become as “indispensable to participation in modern society” as cell phones.

If courts do apply Carpenter to AI interactions, the unique privacy interests in AI communications may require stronger safeguards than those found sufficient for bank records in Shultz. This might not categorically prohibit recordkeeping requirements, but could mean that blanket monitoring regimes are constitutionally suspect. 

We can speculate as to what safeguards an AI monitoring regime might contain beyond those provided in the Bank Secrecy Act. The system could limit itself to flagging user attempts to elicit specific kinds of dangerous behavior (like building biological weapons or hacking critical infrastructure), with automated systems scanning only for these pre-defined indicators of catastrophic risks. The mandate could prohibit bulk transmission of non-flagged conversations, and collected data could be subject to mandatory deletion after defined periods unless specifically preserved by warrant. Clear statutory prohibitions could restrict law enforcement from using any collected data for purposes beyond preventing catastrophic harm, even if other incidental harms are discovered. Independent oversight boards could review monitoring patterns to prevent scope creep, and users whose data is improperly accessed or shared could be granted private rights of action.

While such extensive safeguards may prove unnecessary, they demonstrate how clear legal frameworks for AI monitoring could both protect against threats and enhance privacy compared to today’s ad-hoc approach. Technology companies often make decisions about user monitoring and government cooperation based on their individual interpretations of privacy policies and emergency disclosure provisions. Controversies around content moderation illustrate the tensions of informal government-industry cooperation: Meta CEO Mark Zuckerberg recently expressed regret over yielding to pressure from government officials to remove content during the COVID-19 crisis. In the privacy space, without clear legal boundaries, companies may err on the side of over-compliance with government requests and unnecessarily expose their users’ information. 

Conclusion

The AI era requires navigating two profound risks: unchecked AI misuse that could enable catastrophic harm, and the prospect of widespread government surveillance of our interactions with what may become the 21st century’s most transformative technology. As Justice Brandeis warned in his prescient dissent in Olmstead, “The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well meaning but without understanding.” It is precisely because AI safety presents legitimate risks warranting serious countermeasures that we must be especially vigilant in preventing overreach. By developing frameworks that establish clear boundaries and robust safeguards, we can enable necessary oversight while preventing overzealous intrusions into privacy rights.

The role of compute thresholds for AI governance

I. Introduction

The idea of establishing a “compute threshold” and, more precisely, a “training compute threshold” has recently attracted significant attention from policymakers and commentators. In recent years, various scholars and AI labs have supported setting such a threshold,[ref 1] as have governments around the world. On October 30, 2023, President Biden’s Executive Order 14,110 on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence introduced the first operative example of a compute threshold,[ref 2] although it was one of many orders revoked by President Trump upon entering office.[ref 3] The European Parliament and the European Council adopted the Artificial Intelligence Act on June 13, 2024, providing for the establishment of a compute threshold.[ref 4] On February 4, 2024, California State Senator Scott Wiener introduced Senate Bill 1047, which defined frontier AI models with a compute threshold.[ref 5] The bill was approved by the California legislature but ultimately vetoed by the State’s Governor.[ref 6] China may be considering similar measures, as indicated by recent discussions in policy circles.[ref 7] While not perfect, compute thresholds are currently one of the best options available to identify potentially high-risk models and trigger further scrutiny. Yet, in spite of this, information about compute thresholds and their relevance from a policy and legal perspective remains dispersed.

This Article proceeds in two parts. Part I provides a technical overview of compute and how the amount of compute used in training corresponds to model performance and risk. It begins by explaining what compute is and the role compute plays in AI development and deployment. Compute refers to both computational infrastructure, the hardware necessary to develop and deploy an AI system, and the amount of computational power required to train a model, commonly measured in integer or floating-point operations. More compute is used to train notable models each year, and although the cost of compute has decreased, the amount of compute used for training has increased at a higher rate, causing training costs to increase dramatically.[ref 8] This increase in training compute has contributed to improvements in model performance and capabilities, described in part by scaling laws. As models are trained on more data, with more parameters and training compute, they grow more powerful and capable. As advances in AI continue, capabilities may emerge that pose potentially catastrophic risks if not mitigated.[ref 9]

Part II discusses why, in light of this risk, compute thresholds may be important to AI governance. Since training compute can serve as a proxy for the capabilities of AI models, a compute threshold can operate as a regulatory trigger, identifying what subset of models might possess more powerful and dangerous capabilities that warrant greater scrutiny, such as in the form of reporting and evaluations. Both the European Union AI Act and Executive Order 14,110 established compute thresholds for different purposes, and many more policy proposals rely on compute thresholds to ensure that the scope of covered models matches the nature or purpose of the policy. This Part provides an overview of policy proposals that expressly call for such a threshold, as well as proposals that could benefit from the addition of a compute threshold to clarify the scope of policies that refer broadly to “advanced systems” or “systems with dangerous capabilities.” It then describes how, even absent a formal compute threshold, courts and regulators might rely on training compute as a proxy for how much risk a given AI system poses under existing law. This Part concludes with the advantages and limitations of using compute thresholds as a regulatory trigger.

II. Compute and the Scaling Hypothesis

A. What Is “Compute”?

The term “compute” serves as an umbrella term, encompassing several meanings that depend on context.

Commonly, the term “compute” is used to refer to computational infrastructure, i.e., the hardware stacks necessary to develop and deploy AI systems.[ref 10] Many hardware elements are integrated circuits (also called chips or microchips), such as logic chips, which perform operations, and memory chips, which store the information on which logic devices perform calculations.[ref 11] Logic chips cover a spectrum of specialization, ranging from general-purpose central processing units (“CPUs”), through graphics processing units (“GPUs”) and field-programmable gate arrays (“FPGAs”), to application-specific integrated circuits (“ASICs”) customized for specific algorithms.[ref 12] Memory chips include dynamic random-access memory (“DRAM”), static random-access memory (“SRAM”), and NOT AND (“NAND”) flash memory used in many solid state drives (“SSDs”).[ref 13]

Additionally, the term “compute” is often used to refer to how much computational power is required to train a specific AI system. Whereas the computational performance of a chip refers to how quickly it can execute operations and thus generate results, solve problems, or perform specific tasks, such as processing and manipulating data or training an AI system, “compute” refers to the amount of computational power used by one or more chips to perform a task, such as training a model. Compute is commonly measured in integer operations or floating-point operations (“OP” or “FLOP”),[ref 14] expressing the number of operations that have been executed by one or more chips, while the computational performance of those chips is measured in operations per second (“OP/s” or “FLOP/s”). In this sense, the amount of computational power used is roughly analogous to the distance traveled by a car.[ref 15] Since large amounts of compute are used in modern computing, values are often reported in scientific notation such as 1e26 or 2e26, which refer to 1⋅10²⁶ and 2⋅10²⁶, respectively.
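To make the distinction between a chip’s performance (a rate, in FLOP/s) and compute (a quantity, in FLOP) concrete, here is a minimal back-of-the-envelope sketch. It is not drawn from this Article, and all hardware figures in it are hypothetical:

```python
# Back-of-the-envelope sketch (hypothetical figures): total training compute
# is throughput (FLOP/s) multiplied by time, summed across chips.

def training_compute_flop(num_chips: int, peak_flop_per_s: float,
                          utilization: float, training_days: float) -> float:
    """Total FLOP = chips x per-chip peak FLOP/s x utilization x seconds."""
    seconds = training_days * 24 * 60 * 60
    return num_chips * peak_flop_per_s * utilization * seconds

# Example: 10,000 accelerators at a peak of 1e15 FLOP/s each, 40% utilization,
# running for 90 days.
total = training_compute_flop(10_000, 1e15, 0.40, 90)
print(f"{total:.2e} FLOP")  # roughly 3.1e25 FLOP
```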

Compute is essential throughout the AI lifecycle. The AI lifecycle can be broken down into two phases: development and deployment.[ref 16] In the first phase, development, developers design the model by choosing an architecture, the structure of the network, and initial values for hyperparameters (i.e., parameters that control the learning process, such as number of layers and training rate).[ref 17] Enormous amounts of data, usually from publicly available sources, are processed and curated to produce high-quality datasets for training.[ref 18] The model then undergoes “pre-training,” in which the model is trained on a large and diverse dataset in order to build the general knowledge and features of the model, which are reflected in the weights and biases of the model.[ref 19] Alternatively, developers may use an existing pre-trained model, such as OpenAI’s GPT-4 (“Generative Pre-trained Transformer 4”). The term “foundation model” refers to models like these, which are trained on broad data and adaptable to many downstream tasks.[ref 20] Performance and capabilities improvements are then possible using methods such as fine-tuning on task-specific datasets, reinforcement learning from human feedback (“RLHF”), teaching the model to use tools, and instruction tuning.[ref 21] These enhancements are far less compute-intensive than pre-training, particularly for models trained on massive datasets.[ref 22]

As of this writing, there is no agreed-upon standard for measuring “training compute.” Estimates of “training compute” typically refer only to the amount of compute used during pre-training. More specifically, they refer to the amount of compute used during the final pre-training run, which contributes to the final machine learning model, and does not include any previous test runs or post-training enhancements, such as fine-tuning.[ref 23] There are exceptions: for instance, the EU AI Act considers the cumulative amount of compute used for training by including all the compute “used across the activities and methods that are intended to enhance the capabilities of the model prior to deployment, such as pre-training, synthetic data generation and fine-tuning.”[ref 24] California Senate Bill 1047 addressed post-training modifications generally and fine-tuning in particular, providing that a covered model fine-tuned with more than 3e25 OP or FLOP would be considered a distinct “covered model,” while one fine-tuned on less compute or subjected to unrelated post-training modifications would be considered a “covered model derivative.”[ref 25]
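As a toy illustration of the SB 1047 distinction described above (the 3e25 figure is the one cited from the bill; the function and variable names are hypothetical):

```python
# Hypothetical helper mirroring SB 1047's treatment of fine-tuned models: a
# covered model fine-tuned with more than 3e25 OP/FLOP would itself be a
# distinct "covered model," while one fine-tuned with less compute (or merely
# subjected to unrelated post-training modifications) would be a "covered
# model derivative."

SB1047_FINE_TUNING_THRESHOLD_FLOP = 3e25  # figure cited in the bill

def classify_fine_tuned_model(fine_tuning_compute_flop: float) -> str:
    if fine_tuning_compute_flop > SB1047_FINE_TUNING_THRESHOLD_FLOP:
        return "covered model"           # treated as a new covered model
    return "covered model derivative"    # remains tied to the original model

print(classify_fine_tuned_model(5e25))   # covered model
print(classify_fine_tuned_model(1e24))   # covered model derivative
```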

In the second phase, deployment, the model is made available to users and is used.[ref 26] Users provide input to the model, such as in the form of a prompt, and the model makes predictions from this input in a process known as “inference.”[ref 27] The amount of compute needed for a single inference request is far lower than what is required for a training run.[ref 28] However, for systems deployed at scale, the cumulative compute used for inference can surpass training compute by several orders of magnitude.[ref 29] Consider, for instance, a large language model (“LLM”). During training, a large amount of compute is required over a smaller time frame within a closed system, usually a supercomputer. Once the model is deployed, each text generation leverages its own copy of the trained model, which can be run on a separate compute infrastructure. The model may serve hundreds of millions of users, each generating unique content and using compute with each inference request. Over time, the cumulative compute usage for inference can surpass the total compute required for training.
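A rough worked example may help illustrate how cumulative inference compute can overtake training compute for a widely used model. The figures below are hypothetical, and the 6ND (training) and 2N-per-token (inference) approximations are standard rules of thumb for dense transformer models rather than anything taken from this Article:

```python
# Hypothetical comparison of one-time training compute vs. ongoing inference
# compute, using common approximations: ~6*N*D FLOP to train a dense model
# with N parameters on D tokens, and ~2*N FLOP per generated token at inference.
N = 70e9                      # parameters (hypothetical)
D = 1.4e12                    # training tokens (hypothetical)
training_flop = 6 * N * D     # ~5.9e23 FLOP, paid once

tokens_per_request = 1_000
requests_per_day = 100e6      # hypothetical usage at scale
inference_flop_per_day = 2 * N * tokens_per_request * requests_per_day

days_to_exceed_training = training_flop / inference_flop_per_day
print(f"training compute:        {training_flop:.1e} FLOP")
print(f"inference compute / day: {inference_flop_per_day:.1e} FLOP")
print(f"days until cumulative inference exceeds training: "
      f"{days_to_exceed_training:.0f}")   # ~42 days at these assumptions
```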

There are various reasons to consider compute usage at different stages of the AI lifecycle, as discussed in Section I.E. For clarity, this Article uses “training compute” for compute used during the final pre-training run and “inference compute” for compute used by the model during a single inference, measured in the number of operations (“OP” or “FLOP”). Figure 1 illustrates a simplified version of the language model compute lifecycle.


Figure 1: Simplified language model lifecycle

B. What Is Moore’s Law and Why Is It Relevant for AI?

In 1965, Gordon Moore forecasted that the number of transistors on an integrated circuit would double every year.[ref 30] Ten years later, Moore revised his initial forecast to a two-year doubling period.[ref 31] This pattern of exponential growth is now called “Moore’s Law.”[ref 32] Similar rates of growth have been observed in related metrics, notably including the increase in computational performance of supercomputers;[ref 33] as the number of transistors on a chip increases, so does computational performance (although other factors also play a role).[ref 34]

A corollary of Moore’s Law is that the cost of compute has fallen dramatically; a dollar can buy more FLOP every year.[ref 35] Greater access to compute, along with greater spending from 2010 onwards (i.e., the so-called deep learning era),[ref 36] has contributed to developers using ever more compute to train AI systems. Research has found that the compute used to train notable and frontier models has grown by 4–5x per year between 2010 and May 2024.[ref 37]
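As a quick sanity check on what such a growth rate compounds to (a back-of-the-envelope calculation, not a figure from the cited research):

```python
# Compound growth illustration: sustained 4x or 5x annual growth in training
# compute over the roughly 14 years from 2010 to 2024. Only the ratio matters;
# the starting amount of compute is irrelevant here.
for annual_growth in (4, 5):
    overall = annual_growth ** 14
    print(f"{annual_growth}x/year for 14 years -> ~{overall:.1e}x overall")
# 4x/year -> ~2.7e+08x overall; 5x/year -> ~6.1e+09x overall
```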


Figure 2: Compute used to train notable AI systems from 1950 to 2023[ref 38]

However, the current rate of growth in training compute may not be sustainable. Scholars have cited the cost of training,[ref 39] a limited supply of AI chips,[ref 40] technical challenges with using that much hardware (such as managing the number of processors that must run in parallel to train larger models),[ref 41] and environmental impact[ref 42] as factors that could constrain the growth of training compute. Research in 2018 with data from OpenAI estimated that then-current trends of growth in training compute could be sustained for at most 3.5 to 10 years (2022 to 2028), depending on spending levels and how the cost of compute evolves over time.[ref 43] In 2022, that analysis was replicated with a more comprehensive dataset and suggested that this trend could be maintained for longer, for 8 to 18 years (2030 to 2040) depending on compute cost-performance improvements and specialized hardware improvements.[ref 44]

C. What Are “Scaling Laws” and What Do They Say About AI Models?

Scaling laws describe the functional (mathematical) relationship between the amount of training compute and the performance of the AI model.[ref 45] In this context, performance is a technical metric that quantifies “loss,” which is the amount of error in the model’s predictions. When loss is measured on a test or validation set that uses data not part of the training set, it reflects how well the model has generalized its learning from the training phase. The lower the loss, the more accurate and reliable the model is in making predictions on data it has not encountered during its training.[ref 46] As training compute increases, alongside increases in parameters and training data, so does model performance, meaning that greater training compute reduces the errors made.[ref 47] Increased training compute also corresponds to an increase in capabilities.[ref 48] Whereas performance refers to a technical metric, such as test loss, capabilities refer to the ability to complete concrete tasks and solve problems in the real world, including in commercial applications.[ref 49] Capabilities can also be assessed using practical and real-world tests, such as standardized academic or professional licensing exams, or with benchmarks developed for AI models. Common benchmarks include “Beyond the Imitation Game” (“BIG-Bench”), which comprises 204 diverse tasks that cover a variety of topics and languages,[ref 50] and the “Massive Multitask Language Understanding” benchmark (“MMLU”), a suite of multiple-choice questions covering 57 subjects.[ref 51] To evaluate the capabilities of Google’s PaLM 2 and OpenAI’s GPT-4, developers relied on BIG-Bench and MMLU as well as exams designed for humans, such as the SAT and AP exams.[ref 52]
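For readers who want the functional form, one widely used parametric scaling law from the technical literature (in the style of Hoffmann et al.; shown here for illustration rather than drawn from this Article) models test loss as a sum of power-law terms in parameter count and dataset size, with training compute approximated by a standard rule of thumb:

```latex
% Illustrative scaling-law form (after Hoffmann et al., 2022), not taken from
% this Article. L = test loss, N = parameters, D = training tokens, C = compute.
\begin{align}
  L(N, D) &\approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} \\
  C &\approx 6\,N\,D \quad \text{FLOP (a common rule of thumb)}
\end{align}
```

Here E, A, B, α, and β are empirically fitted constants; lower loss corresponds to better predictive performance.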

Training compute has a relatively smooth and consistent relationship with technical metrics like training loss. Training compute also corresponds to real-world capabilities, but not in a smooth and predictable way. This is due in part to occasional surprising leaps, discussed in Section I.D, and subsequent enhancements such as fine-tuning, which can further increase capabilities using far less compute.[ref 53] Despite being unable to provide a full and accurate picture of a model’s final capabilities, training compute still provides a reasonable basis for estimating the base capabilities (and corresponding risk) of a foundation model. Figure 3 shows the relationship between an increase in training compute and dataset size, and performance on the MMLU benchmark.


Figure 3: Relationship between increase in training compute and dataset size, and performance on MMLU[ref 54]

In light of the correlation between training compute and performance, the “scaling hypothesis” states that scaling training compute will predictably continue to produce even more capable systems, and thus more compute is important for AI development.[ref 55] Some have taken this hypothesis further, proposing a “Bitter Lesson:” that “the only thing that matters in the long run is the leveraging of comput[e].”[ref 56] Since the emergence of the deep learning era, this hypothesis has been sustained by the increasing use of AI models in commercial applications, whose development and commercial success have been significantly driven by increases in training compute.[ref 57]

Two factors weigh against the scaling hypothesis. First, scaling laws describe more than just the performance improvements based on training compute; they describe the optimal ratio of the size of the dataset, the number of parameters, and the training compute budget.[ref 58] Thus, a lack of abundant or high-quality data could be a limiting factor. Researchers estimate that, if training datasets continue to grow at current rates, language models will fully utilize human-generated public text data between 2026 and 2032,[ref 59] while image data could be exhausted between 2030 and 2060.[ref 60] Specific tasks may be bottlenecked earlier by the scarcity of high-quality data sources.[ref 61] There are, however, several ways that data limitations might be delayed or avoided, such as synthetic data generation and using additional datasets that are not public or in different modalities.[ref 62]

Second, algorithmic innovation permits performance gains that would otherwise require prohibitively expensive amounts of compute.[ref 63] Research estimates that every 9 months, improved algorithms for image classification[ref 64] and LLMs[ref 65] contribute the equivalent of a doubling of training compute budgets. Algorithmic improvements include more efficient utilization of data[ref 66] and parameters, the development of improved training algorithms, or new architectures.[ref 67] Over time, the amount of training compute needed to achieve a given capability is reduced, and it may become more difficult to predict performance and capabilities on that basis (although scaling trends of new algorithms could be studied and perhaps predicted). The governance implications of this are multifold, including that increases in training compute may become less important for AI development and that many more actors will be able to access the capabilities previously restricted to a limited number of developers.[ref 68] Still, responsible frontier AI development may enable stakeholders to develop understanding, safety practices, and (if needed) defensive measures for the most advanced AI capabilities before these capabilities proliferate.

D. Are High-Compute Systems Dangerous?

Advances in AI could deliver immense opportunities and benefits across a wide range of sectors, from healthcare and drug discovery[ref 69] to public services.[ref 70] However, more capable models may come with greater risk, as improved capabilities could be used for harmful and dangerous ends. While the degree of risk posed by current AI models is a subject of debate,[ref 71] future models may pose catastrophic and existential risks as capabilities improve.[ref 72] Some of these risks are expected to be closely connected to the unexpected emergence of dangerous capabilities and the dual-use nature of AI models.

As discussed in Section I.C, increases in compute, data, and the number of parameters lead to predictable improvements in model performance (test loss) and general but somewhat less predictable improvements in capabilities (real-world benchmarks and tasks). However, scaling up these inputs to a model can also result in qualitative changes in capabilities in a phenomenon known as “emergence.”[ref 73] That is, a larger model might unexpectedly display emergent capabilities not present in smaller models, suddenly able to perform a task that smaller models could not.[ref 74] During the development of GPT-3, early models had close-to-zero performance on a benchmark for addition, subtraction, and multiplication. Arithmetic capabilities appeared to emerge suddenly in later models, with performance jumping substantially above random at 2·10²² FLOP and continuing to improve with scale.[ref 75] Similar jumps were observed at different thresholds, and for different models, on a variety of tasks.[ref 76]

Some have contested the concept of emergent capabilities, arguing that what appear to be emergent capabilities in large language models are explained by the use of discontinuous measures, rather than by sharp and unpredictable improvements or developments in model capabilities with scale.[ref 77] However, discontinuous measures are often meaningful, as when the correct answer or action matters more than how close the model gets to it. As Anderljung and others explain: “For autonomous vehicles, what matters is how often they cause a crash. For an AI model solving mathematics questions, what matters is whether it gets the answer exactly right or not.”[ref 78] Given the difficulties inherent in choosing an appropriate continuous measure and determining how it corresponds to the relevant discontinuous measure,[ref 79] it is likely that capabilities will continue to seemingly emerge.

Together with emerging capabilities come emerging risks. Like many other innovations, AI systems are dual-use by nature, with the potential to be used for both beneficial and harmful ends.[ref 80] Executive Order 14,110 recognized that some models may “pose a serious risk to security, national economic security, national public health or safety” by “substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear weapons; enabling powerful offensive cyber operations . . . ; [or] permitting the evasion of human control or oversight through means of deception or obfuscation.”[ref 81]

Predictions and evaluations will likely adequately identify many capabilities before deployment, allowing developers to take appropriate precautions. However, systems trained at a greater scale may possess novel capabilities, or improved capabilities that surpass a critical threshold for risk, yet go undetected by evaluations.[ref 82] Some of these capabilities may appear to emerge only after post-training enhancements, such as fine-tuning or more effective prompting methods. A system may be capable of conducting offensive cyber operations, manipulating people in conversation, or providing actionable instructions on conducting acts of terrorism,[ref 83] and still be deployed without the developers fully comprehending unexpected and potentially harmful behaviors. Research has already detected unexpected behavior in current models. For instance, during the recent U.K. AI Safety Summit on November 1, 2023, Apollo Research showed that GPT-4 can take illegal actions like insider trading and then lie about its actions without being instructed to do so.[ref 84] Since the capabilities of future foundation models may be challenging to predict and evaluate, “emergence” has been described as “both the source of scientific excitement and anxiety about unanticipated consequences.”[ref 85]

Not all risks come from large models. Smaller models trained on data from certain domains, such as biology or chemistry, may pose significant risks if repurposed or misused.[ref 86] When MegaSyn, a generative molecule design tool used for drug discovery, was repurposed to find the most toxic molecules instead of the least toxic, it found tens of thousands of candidates in under six hours, including known biochemical agents and novel compounds predicted to be equally or more deadly.[ref 87] The amount of compute used to train DeepMind’s AlphaFold, which predicts three-dimensional protein structures from the protein sequence, is minimal compared to frontier language models.[ref 88] While scaling laws can be observed in a variety of domains, the amount of compute required to train models in some domains may be so low that a compute threshold is not a practical restriction on capabilities.

Broad consensus is forming around the need to test, monitor, and restrict systems of concern.[ref 89] The role of compute thresholds, and whether they are used at all, depends on the nature of the risk and the purpose of the policy: does it target risks from emergent capabilities of frontier models,[ref 90] risks from models with more narrow but dangerous capabilities,[ref 91] or other risks from AI?

E. Does Compute Usage Outside of Training Influence Performance and Risk?

In light of the relationship between training compute and performance expressed by scaling laws, training compute is a common proxy for how capable and powerful AI models are and the risks that they pose.[ref 92] However, compute used outside of training can also influence performance, capabilities, and corresponding risk.

As discussed in Section I.A, training compute typically does not refer to all compute used during development, but is instead limited to compute used during the final pre-training run.[ref 93] This definition excludes subsequent (post-training) enhancements, such as fine-tuning and prompting methods, which can significantly improve capabilities (see supra Figure 1) using far less compute; many current methods can improve capabilities the equivalent of a 5x increase in training compute, while some can improve them by more than 20x.[ref 94]

The focus on training compute also misses the significance of compute used for inference, in which the trained model generates output in response to a prompt or new input data.[ref 95] Inference is the biggest compute cost for models deployed at scale, due to the frequency and volume of requests they handle.[ref 96] While developing an AI model is far more computationally intensive than a single inference request, development is a one-time task. In contrast, once a model is deployed, it may receive numerous inference requests that, in aggregate, exceed the compute expenditures of training. Some have even argued that inference compute could become a bottleneck in scaling AI if inference costs, which scale alongside training compute, grow too large.[ref 97]

Greater availability of inference compute could enhance malicious uses of AI by allowing the model to process data more rapidly and enabling the operation of multiple instances in parallel. For example, AI could more effectively be used to carry out cyber attacks, such as a distributed denial-of-service (“DDoS”) attack,[ref 98] to manipulate financial markets,[ref 99] or to increase the speed, scale, and personalization of disinformation campaigns.[ref 100]

Compute used outside of development may also impact model performance. Specifically, some techniques can increase the performance of a model at the cost of more compute used during inference.[ref 101] Developers could therefore choose to improve a model beyond its current capabilities or to shift some compute expenditures from training to inference, in order to obtain equally-capable systems with less training compute. Users could also prompt a model to use similar techniques during inference, for example by (1) using “few-shot” prompting, in which initial prompts provide the model with examples of the desired output for a type of input,[ref 102] (2) using chain-of-thought prompting, which uses few-shot prompting to provide examples of reasoning,[ref 103] or (3) simply providing the same prompt multiple times and selecting the best result. Some user-side techniques to improve performance might increase the compute used during a single inference, while others would leave it unchanged (while still increasing the total compute used, due to multiple inferences being performed).[ref 104] Meanwhile, other techniques—such as pruning,[ref 105] weight sharing,[ref 106] quantization,[ref 107] and distillation[ref 108]—can reduce compute used during inference while maintaining or even improving performance, and they can further reduce inference compute at the cost of lower performance.
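To make the third, simplest user-side technique concrete, here is a minimal sketch of repeated sampling with selection of the best result; `generate` and `score` are hypothetical placeholders for a model call and a quality metric, not real API functions:

```python
# Minimal "best-of-n" sketch: re-run the same prompt several times and keep the
# highest-scoring output, trading extra inference compute for better results.
import random

def generate(prompt: str) -> str:
    # Placeholder for a model inference call.
    return f"candidate-{random.randint(0, 999)} for: {prompt}"

def score(output: str) -> float:
    # Placeholder for a quality heuristic, verifier, or human preference model.
    return random.random()

def best_of_n(prompt: str, n: int = 5) -> str:
    candidates = [generate(prompt) for _ in range(n)]  # n separate inferences
    return max(candidates, key=score)                  # keep only the best one

print(best_of_n("Draft a short summary of the EU AI Act's compute threshold."))
```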

Beyond model characteristics such as parameter count, other factors can also affect the amount of compute used during inference in ways that may or may not improve performance, such as input size (compare a short prompt to a long document or high-resolution image) and batch size (compare one input provided at a time to many inputs in a single prompt).[ref 109] Thus, for a more accurate indication of model capabilities, compute used to run a single inference[ref 110] for a given set of prompts could be considered alongside other factors, such as training compute. However, doing so may be impractical, as data about inference compute (or architecture useful for estimating it) is rarely published by developers,[ref 111] different techniques could make inference more compute-efficient, and less information is available regarding the relationship between inference compute and capabilities.

While companies might be hesitant to increase inference compute at scale due to cost, doing so may still be worthwhile in certain circumstances, such as for more narrowly deployed models or for customers willing to pay more for improved capabilities. For example, OpenAI offers dedicated instances for users who want more control over system performance, with a reserved allocation of compute infrastructure and the ability to enable features such as longer context limits.[ref 112]

Over time, compute usage during the AI development and deployment process may change. It was previously common practice to train models with supervised learning, which uses annotated datasets. In recent years, there has been a rise in self-supervised, semi-supervised, and unsupervised learning, which use data with limited or no annotation but require more compute.[ref 113] 

III. The Role of Compute Thresholds for AI Governance

A. How Can Compute Thresholds Be Used in AI Policy?

Compute can be used as a proxy for the capabilities of AI systems, and compute thresholds can be used to define the limited subset of high-compute models subject to oversight or other requirements.[ref 114] Their use depends on the context and purpose of the policy. Compute thresholds serve as intuitive starting points to identify potential models of concern,[ref 115] perhaps alongside other factors.[ref 116] They operate as a trigger for greater scrutiny or specific requirements. Once a certain level of training compute is reached, a model is presumed to have a higher risk of displaying dangerous capabilities (and especially unknown dangerous capabilities) and, hence, is subject to stricter oversight and other requirements.

Compute thresholds have already entered AI policy. The EU AI Act requires providers of general-purpose models that cross a compute threshold to assess and mitigate systemic risks, conduct state-of-the-art tests and model evaluations, ensure cybersecurity, and report serious incidents.[ref 117] Under the EU AI Act, a general-purpose model that meets the initial threshold is presumed to have high-impact capabilities and associated systemic risk.[ref 118]

In the United States, Executive Order 14,110 directed agencies to propose rules based on compute thresholds. Although it was revoked by President Trump’s Executive Order 14,148,[ref 119] many actions have already been taken and rules have been proposed for implementing Executive Order 14,110. For instance, the Department of Commerce’s Bureau of Industry and Security issued a proposed rule on September 11, 2024[ref 120] to implement the requirement that AI developers and cloud service providers report on models above certain thresholds, including information about (1) “any ongoing or planned activities related to training, developing, or producing dual-use foundation models,” (2) the results of red-teaming, and (3) the measures the company has taken to meet safety objectives.[ref 121] The executive order also imposed know-your-customer (“KYC”) monitoring and reporting obligations on U.S. cloud infrastructure providers and their foreign resellers, again with a preliminary compute threshold.[ref 122] On January 29, 2024, the Bureau of Industry and Security issued a proposed rule implementing those requirements.[ref 123] The proposed rule noted that training compute thresholds may determine the scope of the rule; the program is limited to foreign transactions to “train a large AI model with potential capabilities that could be used in malicious cyber-enabled activity,” and technical criteria “may include the compute used to pre-train the model exceeding a specified quantity.”[ref 124] The fate of these rules is uncertain, as all rules and actions taken pursuant to Executive Order 14,110 will be reviewed to ensure that they are consistent with the AI policy set forth in Executive Order 14,179, Removing Barriers to American Leadership in Artificial Intelligence.[ref 125] Any rules or actions identified as inconsistent are directed to be suspended, revised, or rescinded.[ref 126]

Numerous policy proposals have likewise called for compute thresholds. Scholars and developers alike have expressed support for a licensing or registration regime,[ref 127] and a compute threshold could be one of several ways to trigger the requirement.[ref 128] Compute thresholds have also been proposed for determining the level of KYC requirements for compute providers (including cloud providers).[ref 129] The Framework to Mitigate AI-Enabled Extreme Risks, proposed by U.S. Senators Romney, Reed, Moran, and King, would include a compute threshold for requiring notice of development, model evaluation, and pre-deployment licensing.[ref 130]

Other AI regulations and policy proposals do not explicitly call for the introduction of compute thresholds but could still benefit from them. A compute threshold could clarify when specific obligations are triggered in laws and guidance that refer more broadly to “advanced systems” or “systems with dangerous capabilities,” as in the voluntary guidance for “organizations developing the most advanced AI systems” in the Hiroshima Process International Code of Conduct for Advanced AI Systems, agreed upon by G7 leaders on October 30, 2023.[ref 131] Compute thresholds could identify when specific obligations are triggered in other proposals, including proposals for: (1) conducting thorough risk assessments of frontier AI models before deployment;[ref 132] (2) subjecting AI development to evaluation-gated scaling;[ref 133] (3) pausing development of frontier AI;[ref 134] (4) subjecting developers of advanced models to governance audits;[ref 135] (5) monitoring advanced models after deployment;[ref 136] and (6) requiring that advanced AI models be subject to information security protections.[ref 137]

B. Why Might Compute Be Relevant Under Existing Law?

Even without a formal compute threshold, the significance of training compute could affect the interpretation and application of existing laws. Courts and regulators may rely on compute as a proxy for how much risk a given AI system poses—alongside other factors such as capabilities, domain, safeguards, and whether the application is in a higher-risk context—when determining whether a legal condition or regulatory threshold has been met. This section briefly covers a few examples. First, it discusses the potential implications for duty of care and foreseeability analyses in tort law. It then goes on to describe how regulatory agencies could depend on training compute as one of several factors in evaluating risk from frontier AI, for example as an indicator of change to a regulated product and as a factor in regulatory impact analysis.

The application of existing laws and ongoing development of common law, such as tort law, may be particularly important while AI governance is still nascent[ref 138] and may operate as a complement to regulations once developed.[ref 139] However, courts and regulators will face new challenges as cases involve AI, an emerging technology in which they have little specialized expertise, and parties will face uncertainty and inconsistent judgments across jurisdictions. As developments in AI unsettle existing law[ref 140] and agency practice, courts and agencies might rely on compute in several ways.

For example, compute could inform the duty of care owed by developers who make voluntary commitments to safety.[ref 141] A duty of care, which is a responsibility to take reasonable care to avoid causing harm to another, can be conditioned on the foreseeability of the plaintiff as a victim or be an affirmative duty to act in a particular way; affirmative duties can arise from the relationship between the parties, such as between business owner and customer, doctor and patient, and parent and child.[ref 142] If AI companies make general commitments to security testing and cybersecurity, such as the voluntary safety commitments secured by the Biden administration,[ref 143] those commitments may give rise to a duty of care in which training compute is a factor in determining what security is necessary. If a lab adopts a responsible scaling policy that requires it to have protection measures based on specific capabilities or potential for risk or misuse,[ref 144] a court might consider training compute as one of several factors in evaluating the potential for risk or misuse.

A court might also consider training compute as a factor when determining whether a harm was foreseeable. More advanced AI systems, trained with more compute, could foreseeably be capable of greater harm, especially in light of scaling laws discussed in Section I.C that make clear the relationship between compute and performance. It may likewise be foreseeable that a powerful AI system could be misused[ref 145] or become the target of more sophisticated attempts at exfiltration, which might succeed without adequate security.[ref 146] Foreseeability may in turn bear on negligence elements of proximate causation and duty of care.

Compute could also play a role in other scenarios, such as in a false advertising claim under the Lanham Act[ref 147] or state and federal consumer protection laws. If a business makes a claim about its AI system or services that is false or misleading, it could be held liable for monetary damages and enjoined from making that claim in the future (unless it becomes true).[ref 148] While many such claims will not involve compute, some may; for example, if a lab publicly claims to follow a responsible scaling policy, training compute could be relevant as an indicator of model capability and the corresponding security and safety measures promised by the policy.

Regulatory agencies may likewise consider compute in their analyses and regulatory actions. For example, the Environmental Protection Agency could consider training (and inference) compute usage as part of environmental impact assessments.[ref 149] Others could treat compute as a proxy for threat to national or public security. Agencies and committees responsible for identifying and responding to various risks, such as the Interagency Committee on Global Catastrophic Risk[ref 150] and the Financial Stability Oversight Council,[ref 151] could consider compute in their evaluation of risk from frontier AI. Over fifty federal agencies were directed to take specific actions to promote responsible development, deployment, and federal use of AI, as well as regulation of industry, in the government-wide effort established by Executive Order 14,110[ref 152]—although these actions are now under review.[ref 153] Even for agencies not directed to consider compute or implement a preliminary compute threshold, compute might factor into how guidance is implemented over time.

More speculatively, changes to training compute could be used by agencies as one of many indicators of how much a regulated product has changed, and thus whether it warrants further review. For example, the Food and Drug Administration might consider compute when evaluating AI in medical devices or diagnostic tools.[ref 154] While AI products considered to be medical devices are more likely to be narrow AI systems trained on comparatively less compute, significant changes to training compute may be one indicator that software modifications require premarket submission. The ability to measure, report, and verify compute[ref 155] could make this approach particularly compelling for regulators.

Finally, training compute may factor into regulatory impact analyses, which evaluate the impact of proposed and existing regulations through quantitative and qualitative methods such as cost-benefit analysis.[ref 156] While this type of analysis is not necessarily determinative, it is often an important input into regulatory decisions and necessary for any “significant regulatory action.”[ref 157] As agencies develop and propose new regulations and consider how those rules will affect or be affected by AI, compute could be relevant in drawing lines that define what conduct and actors are affected. For example, a rule with a higher compute threshold and narrower scope may be less significant and costly, as it covers fewer models and developers. The amount of compute used to train models now and in the future may be not only a proxy for threat to national security (or innovation, or economic growth), but also a source of uncertainty, given the potential for emergent capabilities.

C. Where Should the Compute Threshold(s) Sit?

The choice of compute threshold depends on the policy under consideration: what models are the intended target, given the purpose of the policy? What are the burdens and costs of compliance? Can the compute threshold be complemented with other elements for determining whether a model falls within the scope of the policy, in order to more precisely accomplish its purpose?

Some policy proposals would establish a compute threshold “at the level of FLOP used to train current foundational models.”[ref 158] While the training compute of many models is not public, according to estimates, the largest models today were trained with 1e25 FLOP or more, including at least one open-source model, Llama 3.1 405B.[ref 159] This is the initial threshold established by the EU AI Act. Under the Act, general-purpose AI models are considered to have “systemic risk,” and thus trigger a series of obligations for their providers, if found to have “high impact capabilities.”[ref 160] Such capabilities are presumed if the cumulative amount of training compute, which includes all “activities and methods that are intended to enhance the capabilities of the model prior to deployment, such as pre-training, synthetic data generation and fine-tuning,” exceeds 1e25 FLOP.[ref 161] This threshold encompasses existing models such as Gemini Ultra and GPT-4, and it can be updated upwards or downwards by the European Commission through delegated acts.[ref 162] During the AI Safety Summit held in 2023, the U.K. Government included current models by defining “frontier AI” as “highly capable general-purpose AI models that can perform a wide variety of tasks and match or exceed the capabilities present in today’s most advanced models” and acknowledged that the definition included the models underlying ChatGPT, Claude, and Bard.[ref 163]
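
To make the cumulative accounting concrete, here is a minimal sketch in Python of how compute attributed to pre-training, synthetic data generation, and fine-tuning might be summed and checked against the Act’s 1e25 FLOP presumption. The per-activity figures are purely hypothetical and are not drawn from any actual model.

```python
# Illustrative sketch of the EU AI Act's cumulative compute presumption.
# All figures are hypothetical; the Act counts compute across activities
# intended to enhance capabilities prior to deployment.

EU_SYSTEMIC_RISK_THRESHOLD_FLOP = 1e25

# Hypothetical per-activity compute accounting for a single model.
compute_by_activity = {
    "pre_training": 9.0e24,
    "synthetic_data_generation": 8.0e23,
    "fine_tuning": 4.0e23,
}

cumulative_flop = sum(compute_by_activity.values())
presumed_high_impact = cumulative_flop > EU_SYSTEMIC_RISK_THRESHOLD_FLOP

print(f"Cumulative training compute: {cumulative_flop:.2e} FLOP")
print(f"Presumed to have high-impact capabilities: {presumed_high_impact}")
```

Under these assumed figures, the cumulative total (about 1.02e25 FLOP) would exceed the threshold even though pre-training alone would not, which is the point of counting all pre-deployment activities together.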

Others have proposed an initial threshold of “more training compute than already-deployed systems,”[ref 164] such as 1e26 FLOP[ref 165] or 1e27 FLOP.[ref 166] No known model currently exceeds 1e26 FLOP training compute, which is roughly five times the compute used to train GPT-4.[ref 167] These higher thresholds would more narrowly target future systems that pose greater risks, including potential catastrophic and existential risks.[ref 168] President Biden’s Executive Order on AI[ref 169] and recently-vetoed California Senate Bill 1047[ref 170] are in line with these proposals, both targeting models trained with more than 1e26 OP or FLOP.

Far more models would fall within the scope of a compute threshold set lower than current frontier models. While only two models exceeded 1e23 FLOP training compute in 2017, over 200 models meet that threshold today.[ref 171] As discussed in Section II.A, compute thresholds operate as a trigger for additional scrutiny, and more models falling within the ambit of regulation would entail a greater burden not only on developers, but also on regulators.[ref 172] These smaller, general-purpose models have not yet posed extreme risks, making a lower threshold unwarranted at this time.[ref 173]

While the debate has centered mostly around the establishment of a single training compute threshold, governments could adopt a pluralistic and risk-adjusted approach by introducing multiple compute thresholds that trigger different measures or requirements according to the degree or nature of risk. Some proposals recommend a tiered approach that would create fewer obligations for models trained on less compute. For example, the Responsible Advanced Artificial Intelligence Act of 2024 would require pre-registration and benchmarks for lower-compute models, while developers of higher-compute models must submit a safety plan and receive a permit prior to training or deployment.[ref 174] Multi-tiered systems may also incorporate a higher threshold beyond which no development or deployment can take place, with limited exceptions, such as for development at a multinational consortium working on AI safety and emergency response infrastructure[ref 175] or for training runs and models with strong evidence of safety.[ref 176]

Domain-specific thresholds could be established for models that possess capabilities or expertise in areas of concern, even when those models are trained using less compute than general-purpose models.[ref 177] A variety of specialized models, trained on extensive scientific databases, are already available to advance research.[ref 178] As discussed in Section I.D, these models present a tremendous opportunity, yet many have also recognized the potential threat of their misuse to research, develop, and use chemical, biological, radiological, and nuclear weapons.[ref 179] To address these risks, President Biden’s Executive Order on AI, which set a compute threshold of 1e26 FLOP to trigger reporting requirements, set a substantially lower compute threshold of 1e23 FLOP for models trained “using primarily biological sequence data.”[ref 180] The Hiroshima Process International Code of Conduct for Advanced AI Systems likewise recommends devoting particular attention to offensive cyber capabilities and chemical, biological, radiological, and nuclear risks, although it does not propose a compute threshold.[ref 181]
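
As a rough illustration of how a dual-threshold scheme of this kind might be applied, the sketch below selects the applicable threshold along the lines of Executive Order 14,110. The boolean flag for whether a model was trained primarily on biological sequence data, and the example inputs, are simplifying assumptions made for illustration.

```python
# Rough sketch of a dual-threshold check modeled on EO 14,110's reporting
# triggers: 1e26 operations generally, 1e23 for models trained primarily
# on biological sequence data. Inputs are hypothetical.

GENERAL_THRESHOLD_FLOP = 1e26
BIO_SEQUENCE_THRESHOLD_FLOP = 1e23

def reporting_required(training_compute_flop: float,
                       primarily_bio_sequence_data: bool) -> bool:
    """Return True if the model crosses the applicable threshold."""
    threshold = (BIO_SEQUENCE_THRESHOLD_FLOP if primarily_bio_sequence_data
                 else GENERAL_THRESHOLD_FLOP)
    return training_compute_flop > threshold

# A hypothetical specialized model trained with 5e23 operations of primarily
# biological sequence data would trigger reporting; the same amount of
# compute on general data would not.
print(reporting_required(5e23, primarily_bio_sequence_data=True))   # True
print(reporting_required(5e23, primarily_bio_sequence_data=False))  # False
```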

While domain-specific thresholds could be useful for a variety of policies tailored to specific risks, there are some limitations. It may be technically difficult to verify how much biological sequence data (or other domain-specific data) was used to train a model.[ref 182] Another challenge is specifying how much data in a given domain causes a model to fall within scope, particularly considering the potential capabilities of models trained on mixed data.[ref 183] Finally, the amount of training compute required may be so low that, over time, a compute threshold is not practical.

When choosing a threshold, regulators should be aware that capabilities might be substantially improved through post-training enhancements, and training compute is only a general predictor of capabilities. The absolute limits are unclear at this point; however, current methods can result in capability improvements equivalent to a 5- to 30-fold increase in training compute.[ref 184] To account for post-training enhancements, a governance regime could create a safety buffer, in which oversight or other protective measures are set at a lower threshold.[ref 185] Along similar lines, open-source models may warrant a lower threshold for at least some regulatory requirements, since they could be further trained by another actor and, once released, cannot be moderated or rescinded.[ref 186]
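
One back-of-the-envelope way to reason about such a buffer: if post-training enhancements can yield gains equivalent to roughly a 5- to 30-fold increase in training compute, an oversight trigger could sit below the compute level of concern by that factor. The level of concern and the enhancement factor used below are illustrative assumptions, not recommended values.

```python
# Back-of-the-envelope safety buffer: place an oversight trigger below the
# compute level of concern by the assumed post-training enhancement factor.
# Both inputs are illustrative assumptions.

level_of_concern_flop = 1e26   # hypothetical compute level of concern
max_enhancement_factor = 30    # ~5x-30x equivalent gains cited above

buffered_threshold_flop = level_of_concern_flop / max_enhancement_factor
print(f"Oversight trigger with safety buffer: {buffered_threshold_flop:.1e} FLOP")
# -> roughly 3.3e24 FLOP under these assumptions
```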

D. Does a Compute Threshold Require Updates?

Once established, compute thresholds and related criteria will likely require updates over time.[ref 187] Improvements in algorithmic efficiency could reduce the amount of compute needed to train an equally capable model,[ref 188] or a threshold could be raised or eliminated if adequate protective measures are developed or if models trained with a certain amount of compute are demonstrated to be safe.[ref 189] To further guard against future developments in a rapidly evolving field, policymakers can authorize regulators to update compute thresholds and related criteria.[ref 190]

Several policies, proposed and enacted, have incorporated a dynamic compute threshold. For example, President Biden’s Executive Order on AI authorized the Secretary of Commerce to update the initial compute threshold set in the order, as well as other technical conditions for models subject to reporting requirements, “as needed on a regular basis” while establishing an interim compute threshold of 1e26 OP or FLOP.[ref 191] Similarly, the EU AI Act provides that the 1e25 FLOP compute threshold “should be adjusted over time to reflect technological and industrial changes, such as algorithmic improvements” and authorizes the European Commission to amend the threshold and “supplement benchmarks and indicators in light of evolving technological developments.”[ref 192] California Senate Bill 1047 would have created the Frontier Model Division within the Government Operations Agency and authorized it to “update both of the [compute] thresholds in the definition of a ‘covered model’ to ensure that it accurately reflects technological developments, scientific literature, and widely accepted national and international standards and applies to artificial intelligence models that pose a significant risk of causing or materially enabling critical harms.”[ref 193]

Regulators may need to update compute thresholds rapidly. Historically, failure to quickly update regulatory definitions in the context of emerging technologies has led to definitions becoming useless or even counterproductive.[ref 194] In the field of AI, developments may occur quickly and with significant implications for national security and public health, making responsive rulemaking particularly important. In the United States, there are several statutory tools to authorize and encourage expedited and regular rulemaking.[ref 195] For example, Congress could expressly authorize interim or direct final rulemaking, which would enable an agency to shift the comment period in notice-and-comment rulemaking to take place after the rule has already been promulgated, thereby allowing the agency to respond quickly to new developments.[ref 196]

Policymakers could also require a periodic evaluation of whether compute thresholds are achieving their purpose, to ensure that they do not become over- or under-inclusive. While establishing and updating a compute threshold necessarily involves prospective ex ante impact assessment, in order to take precautions against risk without undue burdens, regulators can learn much from retrospective ex post analysis of current and previous thresholds.[ref 197] In a survey conducted for the Administrative Conference of the United States, “[a]ll agencies stated that periodic reviews have led to substative [sic] regulatory improvement at least some of time. This was more likely when the underlying evidence basis for the rule, particularly the science or technology, was changing.”[ref 198] While the optimal frequency of periodic review is unknown, the study found that U.S. federal agencies were more likely to conduct reviews when provided with a clear time interval (“at least every X years”).[ref 199]

Several further institutional and procedural factors could affect whether and how compute thresholds are updated. In order to effectively update compute thresholds and other criteria, regulators must have access to expertise and talent through hiring, training, consultation and collaboration, and other avenues that facilitate access to experts from academia and industry.[ref 200] Decisions will be informed by the availability of data, including scientific and commercial data, to enable ongoing monitoring, learning, analysis, and adaptation in light of new developments. Decision-making procedures, agency design, and influence and pressures from policymakers, developers, and other stakeholders will likewise affect updates, among many other factors.[ref 201] While more analysis is beyond the scope of this Article, others have explored procedural and substantive measures for adaptive regulation[ref 202] and effective governance of emerging technologies.[ref 203]

Some have proposed defining compute thresholds in terms of effective compute,[ref 204] as an alternative to updates over time. Effective compute could be indexed to a particular year (similar to inflation adjustments) and thus account for algorithmic progress (e.g., 1e25 FLOP of 2023-level effective compute).[ref 205] However, there is no agreed-upon way to precisely define and calculate effective compute, and the ability to do so depends on the challenging task of calculating algorithmic efficiency, including choosing a performance metric to anchor on. Furthermore, effective compute alone would fail to address potential changes in the risk landscape, such as the development of protective measures.
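
The sketch below shows one way effective compute could be indexed to a reference year, assuming, purely for illustration, that algorithmic efficiency improves by a constant annual factor. Both the gain factor and the anchoring metric are contested, which is precisely the difficulty noted above.

```python
# Sketch of year-indexed "effective compute," assuming a constant annual
# algorithmic efficiency gain. The gain factor and anchor are hypothetical
# assumptions, not agreed-upon values.

REFERENCE_YEAR = 2023
ANNUAL_EFFICIENCY_GAIN = 2.0  # assumed: algorithms become 2x more efficient per year

def effective_compute(raw_flop: float, training_year: int) -> float:
    """Convert raw training FLOP into reference-year effective FLOP."""
    years_elapsed = training_year - REFERENCE_YEAR
    return raw_flop * (ANNUAL_EFFICIENCY_GAIN ** years_elapsed)

# A hypothetical 2025 model trained with 1e25 raw FLOP would count as
# 4e25 FLOP of 2023-level effective compute under these assumptions.
print(f"{effective_compute(1e25, 2025):.1e}")
```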

E. What Are the Advantages and Limitations of a Training Compute Threshold?

Compute has several properties that make it attractive for policymaking: it is (1) correlated with capabilities and thus risk, (2) essential for training, with thresholds that are difficult to circumvent without reducing performance, (3) an objective and quantifiable measure, (4) capable of being estimated before training, (5) externally verifiable after training, and (6) a significant cost during development and thus indicative of developer resources. However, training compute thresholds are not infallible: (1) training compute is an imprecise indicator of potential risk, (2) a compute threshold could be circumvented, and (3) there is no industry standard for measuring and reporting training compute.[ref 206] Some of these limitations can be addressed with thoughtful drafting, including clear language, alternative and supplementary elements for defining what models are within scope, and authority to update any compute threshold and other criteria in light of future developments.

First, training compute is correlated with model capabilities and associated risks. Scaling laws predict an increase in performance as training compute increases, and real-world capabilities generally follow (Section I.C). As models become more capable, they may also pose greater risks if they are misused or misaligned (Section I.D). However, training compute is not a precise indicator of downstream capabilities. Capabilities can seemingly emerge abruptly and discontinuously as models are developed with more compute,[ref 207] and the open-ended nature of foundation models means those capabilities may go undetected.[ref 208] Post-training enhancements such as fine-tuning are often not considered a part of training compute, yet they can dramatically improve performance and capabilities with far less compute. Furthermore, not all models with dangerous capabilities require large amounts of training compute; low-compute models with capabilities in certain domains, such as biology or chemistry, may also pose significant risks: biological design tools, for example, could be used for drug discovery or to create pathogens worse than any seen to date.[ref 209] The market may shift towards these smaller, cheaper, more specialized models,[ref 210] and even general-purpose low-compute models may come to pose significant risks. Given these limitations, a training compute threshold cannot capture all possible risks; however, for large, general-purpose AI models, training compute can act as an initial threshold for capturing emerging capabilities and risks.

Second, compute is necessary throughout the AI lifecycle, and a compute threshold would be difficult to circumvent. There is no AI without compute (Section I.A). Due to its relationship with model capabilities, training compute cannot be easily reduced without a corresponding reduction in capabilities, making it difficult to circumvent for developers of the most advanced models. Nonetheless, companies might find “creative ways” to account for how much compute is used for a given system in order to avoid being subject to stricter regulation.[ref 211] To reduce this risk, some have suggested monitoring compute usage below these thresholds to help identify circumvention methods, such as structuring techniques or outsourcing.[ref 212] Others have suggested using compute thresholds alongside additional criteria, such as the model’s performance on benchmarks, financial or energy cost, or level of integration into society.[ref 213] As in other fields, regulatory burdens associated with compute thresholds could encourage regulatory arbitrage if a policy does not or cannot effectively account for that possibility.[ref 214] For example, since compute can be accessed remotely via digital means, data centers and compute providers could move to less-regulated jurisdictions.

Third, compute is an objective and quantifiable metric that is relatively straightforward to measure. Compute is a quantitative measure that reflects the number of mathematical operations performed. It does not depend on specific infrastructure and can be compared across different sets of hardware and software.[ref 215] By comparison, other metrics, such as algorithmic innovation and data, have been more difficult to track.[ref 216] Whereas quantitative metrics like compute can be readily compared across different instances, the qualitative nature of many other metrics makes them more subject to interpretation and difficult to consistently measure. Compute usage can be measured internally with existing tools and systems; however, there is not yet an industry standard for measuring, auditing, and reporting the use of computational resources.[ref 217] That said, there have been some efforts toward standardization of compute measurement.[ref 218] In the absence of a standard, some have instead presented a common framework for calculating compute, based on information about the hardware used and training time.[ref 219]
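
A minimal sketch of the hardware-based approach, with entirely hypothetical inputs: total training compute is approximated as the number of chips, times each chip’s peak throughput, times an assumed utilization rate, times the training time.

```python
# Hardware-based estimate of training compute (all inputs hypothetical):
# total FLOP ~= chips x peak FLOP/s per chip x utilization x seconds trained.

num_chips = 10_000            # hypothetical accelerator count
peak_flop_per_second = 1e15   # hypothetical peak throughput per chip
utilization = 0.35            # assumed fraction of peak actually achieved
training_days = 90

seconds = training_days * 24 * 60 * 60
total_training_flop = num_chips * peak_flop_per_second * utilization * seconds

print(f"Estimated training compute: {total_training_flop:.2e} FLOP")
# -> about 2.7e25 FLOP under these assumptions
```

Because each input is either recorded by the developer or observable by the compute provider, a calculation of this form can be reproduced by a regulator or auditor, which is one reason a common reporting framework is attractive.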

Fourth, compute can be estimated ahead of model development and deployment. Developers already estimate training compute with information about the model’s architecture and amount of training data, as part of planning before training takes place. The EU AI Act recognizes this, noting that “training of general-purpose AI models takes considerable planning which includes the upfront allocation of compute resources and, therefore, providers of general-purpose AI models are able to know if their model would meet the threshold before the training is completed.”[ref 220] Since compute can be readily estimated before a training run, developers can plan a model with existing policies in mind and implement appropriate precautions during training, such as cybersecurity measures.
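
For instance, a widely used rule of thumb for dense transformer models estimates training compute as roughly six FLOP per parameter per training token. The sketch below applies that approximation with hypothetical model and dataset sizes; it is a planning estimate, not a substitute for measured compute.

```python
# Pre-training compute estimate using the common ~6 * N * D rule of thumb
# for dense transformers (N = parameters, D = training tokens).
# Model and dataset sizes below are hypothetical.

parameters = 4e11          # 400B-parameter model (hypothetical)
training_tokens = 1.5e13   # 15T training tokens (hypothetical)

estimated_flop = 6 * parameters * training_tokens
print(f"Estimated training compute: {estimated_flop:.1e} FLOP")
# -> 3.6e25 FLOP, which a developer could compare against applicable
#    thresholds before the training run begins.
```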

Fifth, the amount of compute used could be externally verified after training. While laws that use compute thresholds as a trigger for additional measures could depend on self-reporting, meaningful enforcement requires regulators to be aware of or at least able to verify the amount of compute being used. A regulatory threshold will be ineffective if regulators have no way of knowing whether a threshold has been reached. For this reason, some scholars have proposed that developers and compute providers be required to report the amount of compute used at different stages of the AI lifecycle.[ref 221] Compute providers already employ chip-hours for client billing, which could be used to calculate total computational operations,[ref 222] and the centralization of a few key cloud providers could make monitoring and reporting requirements simpler to administer.[ref 223] Others have proposed using “on-chip” or “hardware-enabled governance mechanisms” to verify claims about compute usage.[ref 224]

Sixth, training compute is an indicator of developer resources and capacity to comply with regulatory requirements, as it represents a substantial financial investment.[ref 225] For instance, Sam Altman reported that the development of GPT-4 cost “much more” than $100 million.[ref 226] Researchers have estimated that Gemini Ultra cost $70 million to $290 million to develop.[ref 227] A regulatory approach based on training compute thresholds can therefore be used to subject only the most resourced AI developers to increased regulatory scrutiny, while avoiding overburdening small companies, academics, and individuals. Over time, the cost of compute will most likely continue to fall, meaning the same thresholds will capture more developers and models. To ensure that the law remains appropriately scoped, compute thresholds can be complemented by additional metrics, such as the cost of compute or development. For example, the vetoed California Senate Bill 1047 was amended to include a compute cost threshold, defining a “covered model” to include a model trained with more than 1e26 OP only if the cost of that training compute exceeded $100,000,000 at the start of training.[ref 228]

At the time of writing, many consider compute thresholds to be the best option currently available for determining which AI models should be subject to regulation, although the limitations of this approach underscore the need for careful drafting and adaptive governance. When considering the legal obligations imposed, the specific compute threshold should correspond to the nature and extent of additional scrutiny and other requirements and reflect the fact that compute is only a proxy for, and not a precise measure of, risk.

F. How Do Compute Thresholds Compare to Capability Evaluations?

A regulatory approach that uses a capabilities-based threshold or evaluation may seem more intuitively appealing and has been proposed by many.[ref 229] There are currently two main types of capability evaluations: benchmarking and red-teaming.[ref 230] In benchmarking, a model is tested on a specific dataset and receives a numerical score. In red-teaming, evaluators can use different approaches to identify vulnerabilities and flaws in a system, such as through prompt injection attacks to subvert safety guardrails. Model evaluations like these already serve as the basis for responsible scaling policies, which specify what protective measures an AI developer must implement in order to safely handle a given level of capabilities. Responsible scaling policies have been adopted by companies like Anthropic, OpenAI, and Google, and policymakers have also encouraged their development and practice.[ref 231]

Capability evaluations can complement compute thresholds. For example, capability evaluations could be required for models exceeding a compute threshold that indicates that dangerous capabilities might exist. They could also be used as an alternative route to being covered by regulation. The EU AI Act adopts the latter approach, complementing the compute threshold with the possibility for the European Commission to “take individual decisions designating a general-purpose AI model as a general-purpose AI model with systemic risk if it is found that such model has capabilities or an impact equivalent to those captured by the set threshold.”[ref 232]

Nonetheless, there are several downsides to depending on capabilities alone. First, model capabilities are difficult to measure.[ref 233] Benchmark results can be affected by factors other than capabilities, such as benchmark data being included during training[ref 234] and model sensitivity to small changes in prompting.[ref 235] Downstream capabilities of a model may also differ from those during evaluation due to changes in dataset distribution.[ref 236] Some threats, such as misuse of a model to develop a biological weapon, may be particularly difficult to evaluate due to the domain expertise required, the sensitivity of information related to national security, and the complexity of the task.[ref 237] For dangerous capabilities such as deception and manipulation, the nature of the capability makes it difficult to assess,[ref 238] although some evaluations have already been developed.[ref 239] Furthermore, while evaluations can point to what capabilities do exist, it is far more difficult to prove that a model does not possess a given capability. Over time, new capabilities may even emerge and improve due to prompting techniques, tools, and other post-training enhancements.

Second, and compounding the issue, there is no standard method for evaluating model capabilities.[ref 240] While benchmarks allow for comparison across models, there are competing benchmarks for similar capabilities; with none adopted as standard by developers or the research community, evaluators could select different benchmark tests entirely.[ref 241] Red-teaming, while more in-depth and responsive to differences in models, is even less standardized and provides less comparable results. Similarly, no standard exists for when during the AI lifecycle a model is evaluated, even though fine-tuning and other post-training enhancements can have a significant impact on capabilities. Nevertheless, there have been some efforts toward standardization, including the U.S. National Institute of Standards and Technology beginning to develop guidelines and benchmarks for evaluating AI capabilities, including through red-teaming.[ref 242]

Third, it is much more difficult to externally verify model evaluations. Since evaluation methods are not standardized, different evaluators and methods may come to different conclusions, and even a small difference could determine whether a model falls within the scope of regulation. This makes external verification simultaneously more important and more challenging. In addition to the technical challenge of how to consistently verify model evaluations, there is also a practical challenge: certain methods, such as red-teaming and audits, depend on far greater access to a model and information about its development. Developers have been reluctant to grant permissive access,[ref 243] which has contributed to numerous calls to mandate external evaluations.[ref 244]

Fourth, model evaluations may be circumvented. For red-teaming and more comprehensive audits, evaluations for a given model may reasonably reach different conclusions, which allows room for an evaluator to deliberately shape results through their choice of methods and interpretation. Careful institutional design is needed to ensure that evaluations are robust to conflicts of interest, perverse incentives, and other limitations.[ref 245] If known benchmarks are used to determine whether a model is subject to regulation, developers might train models to achieve specific scores without affecting capabilities, whether to improve performance on safety measures or to strategically underperform on certain measures of dangerous capabilities.

Finally, capability evaluations entail more uncertainty and expense. Currently, the capabilities of a model can only reliably be determined ex post,[ref 246] making it difficult for developers to predict whether it will fall within the scope of applicable law. More in-depth model evaluations such as red-teaming and audits are expensive and time-consuming, which may constrain small organizations, academics, and individuals.[ref 247]

Capability evaluations can thus be viewed as a complementary tool for estimating model risk. While training compute makes an excellent initial threshold for regulatory oversight, as an objective and quantifiable measure that can be estimated prior to training and verified after, capabilities correspond more closely to risk. Capability evaluations provide more information and can be completed after fine-tuning and other post-training enhancements, but are more expensive, difficult to carry out, and less standardized. Both are important components of AI governance but serve different roles.

IV. Conclusion

More powerful AI could bring transformative changes in society. It promises extraordinary opportunities and benefits across a wide range of sectors, with the potential to improve public health, make new scientific discoveries, improve productivity and living standards, and accelerate economic growth. However, the very same advanced capabilities could result in tremendous harms that are difficult to control or remedy after they have occurred. AI could fail in critical infrastructure, further concentrate wealth and increase inequality, or be misused for more effective disinformation, surveillance, cyberattacks, and development of chemical and biological weapons.

In order to prevent these potential harms, laws that govern AI must identify models that pose the greatest threat. The obvious answer would be to evaluate the dangerous capabilities of frontier models; however, state of the art model evaluations are subjective and unable to reliably predict downstream capabilities, and they can take place only after the model has been developed with a substantial investment.

This is where training compute thresholds come into play. Training compute can operate as an initial threshold for estimating the performance and capabilities of a model and, thus, the potential risk it poses. Despite its limitations, it may be the most effective option we have to identify potentially dangerous AI that warrants further scrutiny. However, compute thresholds alone are not sufficient. They must be used alongside other tools to mitigate and respond to risk, such as capability evaluations, post-market monitoring, and incident reporting. Further research avenues could develop better governance via compute thresholds:

  1. What amount of training compute corresponds to future systems of concern? What threshold is appropriate for different regulatory targets, and how can we identify that threshold in advance? What are the downstream effects of different compute thresholds?
  2. Are compute thresholds appropriate for different stages of the AI lifecycle? For example, could thresholds for compute used for post-training enhancements or during inference be used alongside a training compute threshold, given the ability to significantly improve capabilities at these stages?
  3. Should domain-specific compute thresholds be established, and if so, to address which risks? If domain-specific compute thresholds are established, such as in President Biden’s Executive Order 14,110, how can competent authorities determine if a system is domain-specific and verify the training data?
  4. How should compute usage be reported, monitored, and audited?
  5. How should a compute threshold be updated over time? What is the likelihood of future frontier systems being developed using less (or far less) compute than is used today? Does growth or slowdown in compute usage, hardware improvement, or algorithmic efficiency warrant an update, or should it correspond solely to an increase in capabilities? Relatedly, what kind of framework would allow a regulatory agency to respond to developments effectively (e.g., with adequate information and the ability to update rapidly)?
  6. How could a capabilities-based threshold complement or replace a compute threshold, and what would be necessary (e.g., improved model evaluations for dangerous capabilities and alignment)?
  7. How should the law mitigate risks from AI systems that sit below the training compute threshold?

The National Security Memo on AI: what to expect in Trump 2.0

Any opinions expressed in this post are those of the author and do not reflect the views of the Institute for Law & AI or the U.S. Department of Defense.

On October 24, 2024, President Biden’s National Security Advisor Jake Sullivan laid out the U.S. government’s “first-ever strategy for harnessing the power and managing the risks of AI to advance [U.S.] national security.”[ref 1] The National Security Memorandum on AI (NSM) was initially seen as a major development in U.S. national security policy, but, following former President Donald Trump’s victory in the 2024 election, it is unclear what significance the NSM retains. If he is so inclined, President Trump can rescind the NSM on his first day in office, as he has promised to do with President Biden’s 2023 AI Executive Order (EO). But national security has traditionally been a policy space with a significant degree of continuity between administrations, and at least some of the policy stances embodied in the NSM seem consistent with the first Trump administration’s approach to issues at the intersection of AI and national security. 

So, does the NSM still matter? Will the incoming administration repeal it completely, or merely amend certain provisions while leaving others in place? And what, if anything, might any repealed provisions be replaced with? While other authors have already provided comprehensive analyses of the NSM’s provisions and its accompanying framework, none have focused their assessments on how the documents will fare under the incoming administration. This blog post attempts to fill that gap by analyzing how President Trump and his key advisors may change or continue some of the NSM’s most significant provisions. In summary:

Background

Created in response to a directive in President Biden’s AI EO,[ref 2] the National Security Memorandum on AI (NSM) was a major national security policy priority for the Biden administration. Few technologies over the last 75 years have received similar top-level, interagency attention; the Biden administration officials who designed the NSM have said that they took inspiration from historical efforts to compete against the Soviets in nuclear and space technologies. The NSM is detailed, specific, and lengthy, coming in at more than twice the length of any other national security memorandum issued by the Biden administration other than the National Security Memorandum on Critical Infrastructure Security and Resilience (NSM-22). 

Relative to some of the Biden administration’s other AI policy documents, the NSM more narrowly focuses on the strategic consequences of AI for U.S. national security. It identifies AI as an “era-defining technology”[ref 3] and paints a picture of the United States in a great power competition that, at its core, is a struggle for technological supremacy.[ref 4] The NSM argues that, if the United States does not act now using a coordinated, responsible, and whole-of-society approach to take advantage of AI advances, it “risks losing ground to strategic competitors”[ref 5] and that this lost technological edge “could threaten U.S. national security, bolster authoritarianism worldwide, undermine democratic institutions and processes, facilitate human rights abuses, and weaken the rules-based international order.”[ref 6]

Where previous Biden administration AI documents either took a non-sector-specific approach,[ref 7] excluded non-national security systems,[ref 8] focused guidance narrowly on autonomous and semi-autonomous weapon systems,[ref 9] or provided high-level principles rather than concrete direction,[ref 10] the NSM requires follow-through by all government agencies across the national security enterprise and helps enable that follow-through with concrete implementation guidance. Specifically, the NSM includes more than 80 compulsory assignments[ref 11] to relevant agencies in support of efforts to promote and secure U.S. leadership in AI (focusing particularly on frontier AI models[ref 12]), harness AI to achieve U.S. national security goals, and engage with other countries and multilateral organizations to influence the course of AI development efforts around the world in a direction consistent with U.S. values and interests.[ref 13] Those assignments to agencies seek to accelerate domestic AI development while slowing the development of U.S. adversaries’ capabilities and managing technological risks, including “AI safety, security, and trustworthiness.”[ref 14] Inside the national security enterprise, the NSM seeks to enable effective and responsible AI use while ensuring agencies can manage the technology’s risks.

To the same ends, the NSM provides and requires agencies to follow a separate[ref 15] governance and risk management framework for “AI used as a component of a National Security System.”[ref 16] The framework sets concrete boundaries for national security agencies’ responsible adoption of AI systems in several ways.[ref 17] First, it delineates AI use restrictions and minimum risk management safeguards for specific use cases, ensuring agencies know what they can and cannot legally use AI for and when they must take more thorough risk reduction measures before a given stage of the AI lifecycle. The framework also requires agencies to catalog and monitor their AI use, facilitating awareness and accountability for all AI uses up the chain of command. Lastly, the framework requires agencies to establish standardized training and accountability requirements and guidelines to ensure their personnel’s responsible use and development of AI.

Logistics of a Repeal

Presidents are generally free to revoke, replace, or modify the presidential memoranda issued by their predecessors as they choose, without permission from Congress. To repeal the NSM, President Trump could issue a new memorandum rescinding the entire NSM (and the accompanying framework) or repealing certain provisions while retaining others. A new Executive Order, potentially with the broader purpose of repealing the Biden AI EO, could serve the same function. Both of these options would typically include a policy review led by the National Security Council to assess the status quo and recommend updates, though each presidential administration has revised the exact process to fit their needs. If the NSM does not end up being a top priority, President Trump could also informally direct[ref 18] national security agency heads to stop or change their implementation of some of the NSM’s provisions before he issues a formal policy document. 

Bias and Discrimination

The first NSM provisions on the chopping block will, in all likelihood, be those that focus on bias and discrimination. President Trump and conservatives across the board have vowed to “stop woke and weaponized government” and generally view many of the Biden administration’s policies in this arena as harming U.S. competitiveness and growth, stifling free speech, and negatively impacting U.S. homeland and national security. In the AI context, the 2024 GOP Platform promised to repeal the Biden EO, stating that it “hinders AI Innovation,… imposes Radical Leftwing ideas on the development of this technology,” and restricts freedom of speech.

While not as focused on potentially controversial social issues as some other Biden administration AI policy documents,[ref 19] the NSM does contain several provisions to which the Trump administration will likely object. Specifically, the incoming administration seems poised to cut the NSM’s recognition of “discrimination and bias” as one of nine core AI risk categories[ref 20] that agency heads must “monitor, assess, and mitigate” in their agencies’ development and use of AI.[ref 21] Additionally, the incoming administration may repeal or revise M-24-10—a counterpart to the NSM’s framework that addresses AI risks outside of the national security context—effectively preventing the NSM’s framework from incorporating M-24-10’s various “rights-impacting” use cases.[ref 22] These provisions are easily severable from the current NSM and its framework.

“Safe, Secure, and Trustworthy” AI

One primary focus of the NSM is facilitating the “responsible” adoption of AI by promoting the “safety, security, and trustworthiness” of AI systems through risk management practices, standard-setting, and safety evaluations. In many ways, the NSM’s approach in these sections is consistent with aspects of the first Trump administration’s AI policy, but there is growing conservative and industry support for a deregulatory approach to speed up AI adoption.

The first Trump administration kickstarted federal government efforts to accelerate AI development and adoption with two AI-related executive orders issued in 2019 and 2020. At the time, the administration saw trust and safety as important factors for facilitating adoption of AI technology; the 2019 EO noted that “safety and security concerns” were “barriers to, or requirements associated with” widespread AI adoption, emphasized the need to “foster public trust and confidence in AI technologies and protect civil liberties, privacy, and American values,” and required the National Institute of Standards and Technology (NIST) to develop a plan for the federal government to assist in the development of technical standards “in support of reliable, robust, and trustworthy” AI systems.[ref 23] The 2020 EO sought to “promot[e] the use of trustworthy AI in the federal government” in non-national security contexts and, in service of this goal, articulated principles for the use of AI by government agencies. According to the 2020 EO, agency use of AI systems should be “safe, secure, and resilient,” “transparent,” “accountable,” and “responsible and traceable.”[ref 24] 

The Biden administration continued along a similar course,[ref 25] focusing on the development of soft law mechanisms to mitigate AI risks, including voluntary technical standards, frameworks, and agreements. Echoing the 2019 Trump EO, the Biden NSM argues that standards for safety, security, and trustworthiness will speed up adoption “thanks to [the] increased certainty, confidence, and compatibility” they bring.

But 2020 was a lifetime ago in terms of AI policy. Ultimately, the real question is not whether the second Trump administration thinks that safety, security, and trustworthiness are relevant, but rather whether the NSM provisions relating to trustworthy AI are viewed as, at the margins, facilitating adoption or hindering innovation. While there is certainly overlap between the two administrations’ views, some conservatives have objected to the Biden administration’s AI policy outside of the national security context on the grounds that it focused on safety, security, and trustworthiness primarily for the sake of preventing various harms instead of as a means to encourage and facilitate AI adoption.[ref 26] Others have expressed skepticism regarding discussions of “trust and safety” on the grounds that large tech companies might use safety concerns to stymie competition, ultimately leading to reduced innovation and harm to consumers. In particular, the mandatory reporting requirements placed on AI companies by President Biden’s 2023 EO faced conservative opposition; the 2024 GOP platform asserts that the EO will “hinder[] AI innovation” and promises to overturn it.

Concretely, the NSM requires agencies in the national security enterprise to use its accompanying risk management framework as they implement AI systems; to conduct certain evaluations and testing of AI systems; to monitor, assess, and mitigate AI-related risks; to issue and regularly update agency-specific AI governance and risk management guidance; and to appoint Chief AI Officers and establish AI Governance Boards.[ref 27] The NSM intends these Officers and Boards to ensure accountability, oversight, and transparency in the implementation of the NSM’s framework.[ref 28] The NSM also designates NIST’s AI Safety Institute (AISI) to “serve as the primary United States government point of contact with private sector AI developers to facilitate voluntary pre- and post-public-deployment testing for safety, security, and trustworthiness,” conduct voluntary preliminary pre-deployment testing on at least two frontier AI models, create benchmarks for assessing AI system capabilities, and issue guidance on testing, evaluation, and risk management.[ref 29] Various other agencies with specific expertise are required to provide “classified sector-specific evaluations” of advanced AI models for cyber, nuclear, radiological, biological, and chemical risks.[ref 30]

Unlike the 2023 Biden EO, which invoked the Defense Production Act to impose its mandatory reporting requirements on private companies, the NSM’s provisions on safe, secure, and trustworthy AI impose no mandatory obligations on private companies. This, in addition to the NSM’s national security focus, might induce the Trump administration to leave most of these provisions in effect.[ref 31] However, not all members of the incoming administration may view such a focus on risk management, standards, testing, and evaluation as the best path to AI adoption across the national security enterprise. If arguments for a deregulatory approach toward AI adoption win the day, these NSM provisions and possibly the entire memorandum could face a full repeal. Regardless of the exact approach, the incoming administration seems likely to keep AI innovation as its north star, taking an affirmative approach focused on the benefits enabled by AI and on safety, security, and trustworthiness instrumentally to the degree the administration judges necessary to enable U.S. AI leadership.

Responding to Foreign Threats, Particularly from China

President Trump seems likely to maintain or expand the NSM directives aimed at impeding the AI development efforts of China and other U.S. adversaries. Over the last decade, Washington has seen bipartisan consensus behind efforts to respond to economic gray zone tactics[ref 32] used by U.S. adversaries, particularly China, to compete with the United States. These tactics have included university research partnerships, cyber espionage, insider threats, and both foreign investments in U.S. companies and aggressive headhunting of those companies’ employees to facilitate technology transfer. The NSM builds upon efforts to combat these gray zone tactics by reassessing U.S. intelligence priorities with an eye toward focusing on the U.S. AI ecosystem, strengthening inbound investment screening, and directing the Intelligence Community (IC) to focus on risks to the AI supply chain. If President Trump does not elect to repeal the NSM in its entirety, it seems likely that he will build upon each of these NSM provisions, although potentially applying an approach that more explicitly targets China.[ref 33] 

The NSM requires a review and recommendations for revision of the Biden administration’s intelligence priorities, incorporating into those priorities risks to the U.S. AI ecosystem and enabling sectors.[ref 34] The recommendations, which the White House will likely complete before the inauguration,[ref 35] will help inform the incoming administration’s intelligence priorities. Though this implementation will not make headlines, it could significantly strengthen the incoming administration’s enforcement efforts by enabling better strategies and targeting decisions for export controls, tariffs (including possible component tariffs), outbound investment restrictions, and other measures.

Additionally, the NSM strengthens inbound investment screening by requiring the Committee on Foreign Investment in the United States (CFIUS) to consider, as part of its screenings, whether a given transaction involves foreign actor access to proprietary information related to any part of the AI lifecycle.[ref 36] This provision is consistent with President Trump’s strengthening of CFIUS during his first term—both by championing the Foreign Investment Risk Review Modernization Act (FIRRMA), which expanded CFIUS’s jurisdiction and review process, and by increasing scrutiny on foreign acquisitions of U.S. tech companies, including semiconductor companies. By specifically requiring an analysis of risks related to access of proprietary AI information, this provision seems likely to increase scrutiny of AI-relevant foreign investments covered by CFIUS[ref 37] and to make it more likely that AI-related transactions will be blocked.

The NSM also requires the Intelligence Community to identify critical AI supply chain nodes, determine methods to disrupt or compromise those nodes, and act to mitigate related risks.[ref 38] This directive is consistent with the first Trump administration’s aggressive approach to the use of export controls—including through authorities from the Export Control Reform Act (ECRA), which President Trump signed into law—and diplomacy to disrupt China’s ability to manufacture or acquire critical AI supply chain components. It also parallels Trump-era efforts to secure telecommunications supply chains. Increased IC scrutiny of AI supply chain nodes may provide intelligence allowing the United States and its allies and partners to better leverage their supply chain advantages, just as the Biden administration has attempted to do through multiple new export controls.

Based on these consistencies across the last two administrations and public statements from President Trump’s incoming U.S. Trade Representative Jamieson Greer, the new administration seems poised to double down on the NSM’s combined efforts to protect against Chinese and other adversarial threats to the U.S. AI ecosystem. Congress also appears amenable to further strengthening the President’s ECRA authorities in support of possible Trump administration efforts, extending them to cover AI systems and cloud compute providers that enable the training of AI models.[ref 39] However, President Biden’s recent export controls have met with opposition from major players in the semiconductor supply chain and conservative open-source advocates. President Trump could also potentially use such restrictions as bargaining chips, easing restrictions in order to secure concessions from foreign competitors in other policy areas.

Because the NSM’s provisions related to foreign threats do not significantly affect open-source models, they seem unlikely to provoke many objections from the incoming administration, except to the extent that they do not go far enough or avoid explicitly identifying China.[ref 40] This does not necessarily mean that President Trump will avoid repealing them, however, as it remains possible that the incoming administration will find it more convenient to repeal the entire document and replace provisions as necessary than to pick and choose its targets.

Infrastructure

President Trump seems likely to expand upon or at least continue the NSM’s provisions that focus on developing the energy infrastructure necessary to meet expected future AI power needs (without the Biden Administration’s focus on clean power), strengthening domestic chip production, and making AI resources accessible to diverse actors while increasing the government’s efficiency in using its own AI resources. 

Bipartisan consensus exists around the need to build the infrastructure required to facilitate the development of next-generation AI systems. President Trump has already signaled that this issue will be one of his top priorities, and President Biden recently issued an Executive Order on Advancing United States Leadership in Artificial Intelligence Infrastructure.[ref 41] Although both parties recognize the importance of AI infrastructure, President Trump’s team has indicated that they intend to adopt a modified “energy dominance” version of this priority. Where the Biden administration sees the United States as being at risk of falling behind without additional clean power, the Trump administration views the nation as already behind the curve and needing to address its energy deficit with all types of power, including fossil fuels and nuclear energy. The incoming administration also sees power generation and the resulting lower energy prices as a potential asymmetric advantage for the United States in the “A.I. arms race.” Therefore, President Trump seems likely to significantly expand on the Biden administration’s efforts to provide power for the U.S. AI ecosystem. With respect to the NSM, this likely means that the provision requiring the White House Chief of Staff to coordinate the streamlining of permits, approvals, and incentives for the construction of AI-enabling infrastructure and supporting assets[ref 42] will survive, unless the NSM is repealed in its entirety.

Though the NSM does not focus on U.S. chip production infrastructure, National Security Advisor Sullivan pointed to the progress already made through the CHIPS and Science Act’s “generational investment in [U.S.] semiconductor manufacturing” in his speech announcing the memorandum. Under the incoming administration, however, the survival of that bipartisan effort is somewhat uncertain. While the legislation has received significant support from Republican members of Congress, President Trump has criticized the bill as less efficient than tariffs, and he could delay or block the distribution of promised funds to chip companies. That said, the concept of incentivizing foreign chip firms to build fabs in the United States was originally devised during the first Trump administration, and Taiwan Semiconductor Manufacturing Company (TSMC) made its first Arizona investment during Trump’s first term in office. Some commentators have also argued that, at least for many segments of the chip industry, tariffs alone will not solve the United States’ chip problem. Given strong Republican support for CHIPS Act-funded U.S. factories and the national security case for such investments, the incoming administration seems most likely to continue advancing many of the bill’s infrastructure goals. Rather than attempting a broad reversal, President Trump might instead remove application-guideline requirements that funding recipients provide child care, encourage union labor, and demonstrate environmental responsibility, including by using renewable energy to operate their facilities.

The NSM also requires agencies to consider AI needs in their construction and renovation of federal compute facilities;[ref 43] begin a federated AI and data sources pilot project;[ref 44] and distribute compute, data, and other AI assets to actors who would otherwise lack access.[ref 45] Because these assignments are largely consistent with prior Trump efforts, they appear more likely than not to continue. The AI construction assessment requirement and the federated AI pilot align with the incoming administration’s focus on efficiency. Additionally, the legislation that began the National AI Research Resource (NAIRR) became law during President Trump’s first term, and he may continue to support its mission of democratizing access to AI assets, although potentially not at the levels requested by the Biden administration’s Director of the Office of Science and Technology Policy.

Talent and Immigration

Whether the NSM’s provisions relating to high-skilled immigration will survive under the new administration is uncertain, but its non-immigration initiatives focused on AI talent seem likely to continue.

The NSM aims to better recruit and retain AI talent at national security agencies by revising federal hiring and retention policies to accelerate responsible AI adoption,[ref 46] identifying education and training opportunities to increase AI fluency across the national security workforce,[ref 47] establishing a National Security AI Executive Talent Committee,[ref 48] and conducting “an analysis of the AI talent market in the United States and overseas” to inform future AI talent policy choices.[ref 49] These initiatives seem consistent with actions taken during the previous Trump administration and therefore likely to survive in some form. For example, President Trump’s signature AI for the American Worker initiative focused on training and upskilling workers with AI-relevant skills. President Trump also signed into law the bill that established the National Security Commission on AI, which completed the most significant government analysis to date of the AI national security challenge and whose final report emphasized the importance of recruiting and retaining AI talent within the government’s national security enterprise.

The NSM also seeks to better compete for AI talent by directing relevant agencies both to “use all available legal authorities to assist in attracting and rapidly bringing to the United States” individuals who would increase U.S. competitiveness in “AI and related fields”[ref 50] and to convene agencies to “explore actions for prioritizing and streamlining administrative processing operations for all visa applicants working with sensitive technologies.”[ref 51] In practice, this effort would likely involve continued work to expand and improve the H-1B visa process, as well as other potential skilled immigration pathways and policies like O-1A and J-1 visas, Optional Practical Training, the International Entrepreneur Rule, and the Schedule A list.

President Trump’s position on high-skilled immigration, and on the H-1B program specifically, appears to have softened since his first term, but it is unclear to what degree and how that will affect his policy decisions. On the campaign trail in 2024, President Trump stated his support for providing foreign graduates of U.S. universities, and even “junior colleges,” with green cards to stay in the United States, although his campaign later walked back the statement and clarified the need for “the most aggressive vetting process in U.S. history” before permitting graduates to stay. President Trump’s Senior Policy Advisor for AI, Sriram Krishnan, is a strong supporter of H-1B visa expansion. Most significantly, after the Krishnan announcement sparked a fiery online debate between President Trump’s pro-H-1B advisors Elon Musk and Vivek Ramaswamy and prominent H-1B critics like Steve Bannon and Laura Loomer, President Trump reaffirmed his support for H-1B visas, saying, “we need smart people coming into our country.”

However, a significant portion of President Trump’s political base would prefer to shrink the H-1B program, as he did during his first term in the name of protecting American jobs. During that term, President Trump repeatedly restricted H-1B visas for skilled immigrants, including through his “Buy American, Hire American” Executive Order and an interim H-1B program revision.[ref 52] His former Senior Advisor Stephen Miller and former acting U.S. Immigration and Customs Enforcement Director Tom Homan, who were major proponents of these H-1B cuts, will serve in the new administration as Homeland Security Advisor and “border czar,” respectively. These posts will likely allow them to exert significant influence on the President’s immigration decisions and, potentially, to prevail over Trump’s supporters in Silicon Valley and other proponents of high-skilled immigration like Jared Kushner and UN Ambassador nominee Elise Stefanik. Additionally, cracking down on immigration in order to “put American workers first” was a core element of the 2024 Republican Party platform.

One key early indicator of which way President Trump leans will be whether he continues or attempts to roll back the Biden administration’s long-awaited revision to the H-1B program, which took effect on the last business day before President Trump’s inauguration and aims to streamline the approvals process, increase flexibility, and strengthen oversight.

Conclusion

It is clear that AI will be a key part of the incoming administration’s national security policy. Throughout his campaign, President Trump emphasized developing domestic AI infrastructure, particularly energy production, and since winning the election he has prioritized the appointment of multiple high-level AI advisors.

However, while some of the incoming administration’s responses to the NSM seem locked in—notably, removing provisions relating to discrimination and bias, building on the NSM’s shift toward increasing U.S. power production to support AI energy needs, and continuing efforts to slow China’s development of advanced AI systems—there are also key areas where its approach remains uncertain. Regardless of how the Trump administration’s policies at the intersection of AI and national security shake out, its response to the NSM will serve as a useful early indicator of the direction those policies will take.

Insuring emerging risks from AI

What should be internationalised in AI governance?