Balancing safety and privacy: regulatory models for AI misuse
Since consumer AI tools have exploded in popularity, fears of AI-based threats to security have moved from sci-fi to reality. The FBI warns that criminals are already using AI to hack financial networks, and OpenAI disrupted an Iranian government disinformation operation last year. But risks could rapidly escalate beyond theft and propaganda to truly catastrophic threats—from designing deadly viruses to hacking into critical infrastructure. Such misuse poses a threat not only to AI users but to national security itself.
In response, proposals have emerged for mandatory monitoring and reporting mechanisms to prevent AI misuse. These proposals demand careful scrutiny. The Supreme Court typically protects reasonable expectations of privacy under the Fourth Amendment, and people may reasonably expect to use these new tools without fear of government surveillance.
Yet governments should not shy away from carefully designed oversight. AI labs likely already conduct some legal monitoring of their consenting users. In addition, U.S. law has several analogous frameworks—notably the Bank Secrecy Act and laws combating child sexual abuse material (CSAM)—that require private companies to record potentially illicit activity and/or make reports to authorities. These precedents show how reasonable monitoring regulation can help prevent crime while respecting privacy rights.
AI Misuse Risks
Artificial intelligence systems present various categories of potential catastrophic risks, ranging from unintended accidents to loss of human control over increasingly powerful systems. But we need not imagine a “Skynet” scenario to worry about catastrophic AI. Another kind of risk is simple misuse: bad actors who intentionally use AI to do dangerous and illegal things. This intentional misuse raises particularly salient privacy concerns, as mitigating it requires monitoring individual user behavior rather than just overseeing AI systems or their developers.
While AI might enable various forms of criminal activity, from copyright infringement to fraud, two categories of catastrophic misuse merit particularly careful consideration due to their potential for widespread devastation. First, AI could dramatically lower barriers to bioterrorism by helping malicious actors design and create deadly pathogens. Current AI models can already provide detailed scientific knowledge and laboratory protocols that could potentially be exploited for biological weapons development. Researchers have also shown that language models can directly instruct laboratory robots to carry out experiments, suggesting that as AI advances, the capability to create deadly pathogens could become increasingly available to potential bad actors.
Second, AI systems may enable unprecedented cyber warfare capabilities that could threaten critical infrastructure and national security. A recent FBI threat assessment highlights how AI could enable sophisticated cyber-physical attacks on critical infrastructure, from manipulating industrial control systems to compromising autonomous vehicle safety systems. For instance, in 2017, the “Triton” malware attack targeted petrochemical plants in the Middle East, attempting to disable critical safety mechanisms. As capabilities improve, we may see fully autonomous AI systems conducting cyberattacks with minimal human oversight.
Government-mandated monitoring may be justified to address these risks, but it should not be taken lightly. Focusing specifically on the most serious threats helps maintain an appropriate balance between security and privacy.
Current Safety Measures
AI developers use various methods to prevent misuse, including “fine-tuning” models and filtering suspicious prompts. However, researchers have demonstrated the ability to “jailbreak” models and bypass these built-in restrictions. This vulnerability suggests the need for a monitoring system that allows developers to respond swiftly to initial cases of misuse by limiting a bad actor’s ability to cause further harm. AI providers may scan user interactions for patterns indicative of misuse attempts, flag high-risk users, and take actions ranging from warnings to access restrictions or account bans.
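To make this concrete, the sketch below shows one way such a provider-side monitoring pipeline could work in principle: prompts are checked against a narrow set of pre-defined misuse indicators, flags are tracked per user, and enforcement escalates from warnings to bans. This is purely illustrative; the pattern list, thresholds, and class names are assumptions for exposition, not any provider’s actual system.

```python
# Illustrative sketch of a provider-side misuse-monitoring pipeline.
# All patterns, thresholds, and names here are hypothetical assumptions,
# not any provider's actual system.
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical indicators of the narrow catastrophic-misuse categories
# discussed above (bioweapons, attacks on critical infrastructure).
MISUSE_PATTERNS = [
    re.compile(r"synthesi[sz]e .*pathogen", re.IGNORECASE),
    re.compile(r"disable .*industrial control system", re.IGNORECASE),
]

@dataclass
class UserRecord:
    flags: int = 0
    banned: bool = False

class MisuseMonitor:
    """Scans prompts, tracks per-user flags, and escalates enforcement."""

    def __init__(self, warn_at: int = 1, ban_at: int = 3):
        self.users: dict[str, UserRecord] = defaultdict(UserRecord)
        self.warn_at = warn_at
        self.ban_at = ban_at

    def review(self, user_id: str, prompt: str) -> str:
        record = self.users[user_id]
        if record.banned:
            return "blocked"
        if any(p.search(prompt) for p in MISUSE_PATTERNS):
            record.flags += 1
            if record.flags >= self.ban_at:
                record.banned = True
                return "banned"   # candidate for an incident report
            if record.flags >= self.warn_at:
                return "warned"
        return "allowed"
```

In practice, providers likely rely on classifiers far more sophisticated than keyword patterns, but the escalation logic (flag, warn, restrict) follows the same basic shape.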
These private monitoring efforts operate within a statutory framework that generally allows companies enough flexibility to monitor their services when necessary. The Electronic Communications Privacy Act (ECPA) restricts companies from accessing users’ communications, but contains several relevant exceptions—including consent, ordinary course of business activities, protecting the provider’s rights and property, and emergency disclosures. Technology companies typically seek to establish consent through their privacy policies (though the legal sufficiency of this approach is often questioned), and also have significant latitude to monitor communications when necessary to make their services function. The ECPA also permits disclosure to law enforcement with proper legal process, and allows emergency disclosures when providers reasonably believe there is an immediate danger of death or serious physical injury. Thus, AI providers already have legal pathways to share critical threat information with authorities, but are not under clear obligations to do so.
Incident Reporting
The shortcoming of purely internal monitoring is that malicious actors can migrate to other models after being banned or use multiple models to avoid detection. Accordingly, there is a need for centralized reporting systems to alert other developers of risks. Nonprofits like the Responsible AI Collaborative have begun to collect media reports of AI incidents, but documented real-world incidents likely represent only the tip of the iceberg. More importantly, focusing solely on successful attacks that caused harm misses the broader picture—AI providers regularly encounter suspicious behavior patterns, thwarted attempts at misuse, and users who may pose risks across multiple platforms.
One potential model for addressing these limitations comes from requirements for reporting child sexual abuse material (CSAM). Under 18 U.S.C. § 2258A, electronic service providers must report detected CSAM to the National Center for Missing and Exploited Children, but face no obligation to proactively monitor for such material. Generally, § 2258A has survived Fourth Amendment challenges under the “private search doctrine,” which holds that the Fourth Amendment protects only against government searches, not private action. While private entity searches can be attributed to the government when there is sufficient government encouragement or participation, circuit courts have rejected Fourth Amendment challenges to § 2258A because it requires only reporting while explicitly disclaiming any monitoring requirement. As the Ninth Circuit explained in United States v. Rosenow, “mandated reporting is different than mandated searching,” because communications providers are “free to choose not to search their users’ data.”
California recently considered a similar approach to reporting in SB 1047, one provision of which would have required AI model developers to report “artificial intelligence safety incident[s]” to the state Attorney General within 72 hours of discovery. While ultimately vetoed, this reporting-focused approach offers several advantages: it would create a central clearinghouse for incident data and facilitate coordination across competing AI labs, all without imposing any direct obligation on AI companies to monitor their users.
A reporting-only mandate may paradoxically discourage active monitoring. If only required to report the problems they discover, some companies may choose not to look for them. This mirrors concerns raised during the “Crypto Wars” debates, where critics argued that encryption technology not only hindered third party access to communications but also prevented companies themselves from detecting and reporting illegal activity. For instance, while Meta reports CSAM found on public Facebook feeds, end-to-end encryption is the default for channels like WhatsApp—meaning Meta can neither proactively detect CSAM on these channels nor assist law enforcement in investigating it after the fact.
AI companies might similarly attempt to move towards systems that make monitoring difficult. While most current commercial AI systems process inputs as unencrypted text, providers could shift toward local models running on users’ devices. More ambitiously, some companies are working on “homomorphic” encryption techniques—which allow computation on encrypted data—for AI models. Short of obtaining the user’s device, these approaches would place AI interactions beyond providers’ reach.
Mandatory Recordkeeping
Given the limitations of a pure reporting mandate, policymakers might consider requiring AI providers to maintain certain records of user interactions, similar to bank recordkeeping requirements. The Bank Secrecy Act of 1970, passed to help law enforcement detect and prevent money laundering, provides an instructive precedent. The Act required banks both to maintain records of customer identities and transactions, and to report transactions above specified thresholds. The Act faced immediate constitutional challenges, but the Supreme Court upheld it in California Bankers Association v. Shultz (1974). The Court highlighted several factors that overcame the plaintiffs’ objections: the Act did not authorize direct government access without legal process; the requirements focused on specific categories of transactions rather than general surveillance; and there was a clear nexus between the recordkeeping and legitimate law enforcement goals.
This framework suggests how AI monitoring requirements might be structured: focusing on specific high-risk patterns rather than blanket surveillance, requiring proper legal process for government access, and maintaining clear links between the harm being protected against (catastrophic misuse) and the kinds of records being kept.
Unlike bank records, however, AI interactions have the potential to expose intimate thoughts and personal relationships. Recent Fourth Amendment doctrine suggests that this type of privacy may merit a higher level of scrutiny.
Fourth Amendment Considerations
The Supreme Court’s modern Fourth Amendment jurisprudence begins with Katz v. United States (1967), which established that government surveillance constitutes a “search” when it violates a “reasonable expectation of privacy.” Under the subsequent “third-party doctrine” developed in United States v. Miller (1976) and Smith v. Maryland (1979), individuals generally have no reasonable expectation of privacy in information voluntarily shared with third parties. This might suggest that AI interactions, like bank records, fall outside Fourth Amendment protection.
However, a growing body of federal case law has increasingly recognized heightened privacy interests in digital communications. In United States v. Warshak (2010), the Sixth Circuit found emails held by third parties deserve greater Fourth Amendment protection than traditional business records, due to their personal and confidential nature. Over the next decade, the Supreme Court similarly extended Fourth Amendment protections to GPS tracking, cell phone searches, and finally, cell-site location data. The latter decision, Carpenter v. United States (2018), was heralded as an “inflection point” in constitutional privacy law for its potentially broad application to various kinds of digital data, irrespective of who holds it.
Though scholars debate Carpenter’s ultimate implications, early evidence suggests that courts are applying some version of the key factors that the opinion indicates are relevant for determining whether digital data deserves Fourth Amendment protection: (1) the “deeply revealing nature” of the information, (2) its “depth, breadth, and comprehensive reach,” and (3) whether its collection is “inescapable and automatic.”
All three factors raise concerns about AI monitoring. First, if Carpenter worried that location data could reveal personal associations in the aggregate, AI interactions can directly expose intimate thoughts and personal relationships. The popularity of AI companions designed to simulate close personal relationships is only an extreme version of the kind of intimacy someone might have with their chatbot. Second, AI’s reach is rapidly expanding: ChatGPT reached 100 million monthly active users within two months of launch, suggesting it may approach the scale of the “400 million devices” that concerned the Carpenter Court. The third factor currently presents the weakest case for protection, as AI interactions still involve conscious queries rather than automatic collection. However, as AI becomes embedded into computer interfaces and standard work tools, using these systems may become as “indispensable to participation in modern society” as cell phones.
If courts do apply Carpenter to AI interactions, the unique privacy interests in AI communications may require stronger safeguards than those found sufficient for bank records in Shultz. This might not categorically prohibit recordkeeping requirements, but could mean that blanket monitoring regimes are constitutionally suspect.
We can speculate as to what safeguards an AI monitoring regime might contain beyond those provided in the Bank Secrecy Act. The system could limit itself to flagging user attempts to elicit specific kinds of dangerous behavior (like building biological weapons or hacking critical infrastructure), with automated systems scanning only for these pre-defined indicators of catastrophic risk. The mandate could prohibit bulk transmission of non-flagged conversations, and collected data could be subject to mandatory deletion after defined periods unless specifically preserved by warrant. Clear statutory prohibitions could bar law enforcement from using any collected data for purposes beyond preventing catastrophic harm, even if other incidental harms are discovered. Independent oversight boards could review monitoring patterns to prevent scope creep, and users whose data is improperly accessed or shared could be granted private rights of action.
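The sketch below illustrates how several of these safeguards might be encoded in a provider’s retention logic: only pre-defined catastrophic-risk categories can be flagged, non-flagged conversations are never stored, and flagged records are deleted after a fixed window unless preserved by warrant. The category names, retention period, and class structure are assumptions for illustration, not drawn from any statute or existing system.

```python
# Illustrative sketch of a retention policy embodying the safeguards above.
# Categories, retention periods, and field names are hypothetical.
from dataclasses import dataclass
from datetime import datetime, timedelta

# Only narrow, pre-defined catastrophic-risk categories may be flagged.
FLAGGABLE_CATEGORIES = {"biological_weapons", "critical_infrastructure_attack"}

@dataclass
class FlaggedRecord:
    category: str
    created_at: datetime
    preserved_by_warrant: bool = False

class RetentionPolicy:
    """Stores only flagged records and deletes them after a fixed window
    unless specifically preserved by warrant."""

    def __init__(self, retention_days: int = 90):
        self.retention = timedelta(days=retention_days)
        self.records: list[FlaggedRecord] = []

    def flag(self, category: str, now: datetime) -> None:
        # Non-flagged conversations are never stored or transmitted in bulk.
        if category not in FLAGGABLE_CATEGORIES:
            raise ValueError("category outside the pre-defined scope")
        self.records.append(FlaggedRecord(category=category, created_at=now))

    def purge_expired(self, now: datetime) -> None:
        # Mandatory deletion after the retention window, absent a warrant.
        self.records = [
            r for r in self.records
            if r.preserved_by_warrant or now - r.created_at < self.retention
        ]
```

A real regime would of course require far more than code, including audit trails, legal process for preservation orders, and independent oversight review, but encoding the limits this explicitly is what distinguishes targeted recordkeeping from blanket surveillance.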
While such extensive safeguards may prove unnecessary, they demonstrate how clear legal frameworks for AI monitoring could both protect against threats and enhance privacy compared to today’s ad-hoc approach. Technology companies often make decisions about user monitoring and government cooperation based on their individual interpretations of privacy policies and emergency disclosure provisions. Controversies around content moderation illustrate the tensions of informal government-industry cooperation: Meta CEO Mark Zuckerberg recently expressed regret over yielding to pressure from government officials to remove content during the COVID-19 crisis. In the privacy space, without clear legal boundaries, companies may err on the side of over-compliance with government requests and unnecessarily expose their users’ information.
Conclusion
The AI era requires navigating two profound risks: unchecked AI misuse that could enable catastrophic harm, and the prospect of widespread government surveillance of our interactions with what may become the 21st century’s most transformative technology. As Justice Brandeis warned in his prescient dissent in Olmstead, “The greatest dangers to liberty lurk in insidious encroachment by men of zeal, well meaning but without understanding.” It is precisely because AI safety presents legitimate risks warranting serious countermeasures that we must be especially vigilant in preventing overreach. By developing frameworks that establish clear boundaries and robust safeguards, we can enable necessary oversight while preventing overzealous intrusions into privacy rights.