America and China Can Make AI Safer
As artificial intelligence increasingly defines economic and strategic competition between the United States and China, the technology also creates extreme risks that transcend national borders. An individual could potentially use an AI model or a combination of models to engineer a dangerous pathogen, launch autonomous cyberattacks on power grids or hospital networks, or create and disseminate realistic deepfakes that erode public trust—regardless of whether that individual lives in Dalian, Dallas, or Delhi. Neither the United States nor China benefits from an AI race in which a model from either country could cause catastrophic harm anywhere.
Chinese models present particularly acute vulnerabilities. DeepSeek’s open-source large language model, R1-0528, for example, lacks many of the safeguards that are built into U.S. systems. According to U.S. government research, it accepts malicious instructions 12 times more often than leading U.S. models do. DeepSeek’s models are also significantly more vulnerable to attackers: standard jailbreaking methods—techniques to bypass a model’s built-in safety controls—elicit harmful responses 94 percent of the time, versus just eight percent of the time for comparable American systems. This risk compounds when a Chinese model powers many autonomous agents, such as the now viral OpenClaw, which can browse the web and access databases at scale without human oversight.
As the two dominant powers in transformative AI, Washington and Beijing will determine whether it creates widely shared benefits or generates dangerous new risks. When great powers develop high-risk technologies, open communication channels are essential to prevent misunderstandings that could lead to disaster. During the height of the Cold War, for example, U.S. scientists shared information with the Soviet Union about technologies to prevent unauthorized nuclear use. Sharing information about critical technologies requires careful discretion about what to disclose and what to withhold. But even the most intense rivals can find ways to cooperate effectively.
The United States and China must collaborate to manage the growing risks of AI even as they compete for technological supremacy. A prudent U.S. risk mitigation strategy does not mean slowing down innovation. Instead, it means working with Beijing to come to an understanding of safety research priorities, to coordinate the testing of models for vulnerabilities and the implementation of safeguards, and to jointly establish best practices to contain truly global risks. China, meanwhile, needs to invest in the technical capacity that makes engagement on AI safety worthwhile. Working together is necessary, and with the right approach, it is feasible. By focusing on how to look for risks rather than on the specifics of what they find, Washington and Beijing can compete fiercely on AI while still mitigating the most extreme dangers it presents to the world.
Making AI safer requires a clear understanding of both the risks that the technology creates and the tools available to minimize them. Systematic assessments of frontier AI developments serve the same function as clinical trials do for new drugs and crash tests do for automobiles. They identify dangers before and during deployment to ensure that technological innovation does not cause preventable harm.
But AI systems differ from drugs and cars. They are general-purpose technologies that evolve continuously after deployment, can be repurposed by users in ways that developers never anticipated, and spread globally at unprecedented speed. Testing new systems before they are released is not enough to account for the unexpected ways in which AI capabilities can develop. This unpredictability is why it is urgent to continuously and rigorously screen for new risks and intervene in real time.
Both the United States and China have started to recognize the need for stronger safety practices across the AI supply chain. In the United States, a layered system is taking shape. It starts with leading AI companies, which have vast technical resources to assess the potential for extreme contingencies and to adjust their models accordingly. Third-party evaluators with independent credibility and expertise can test these models. Safety work also extends to the applications built on top of these models: an array of independent organizations is building tools that can be tailored to intercept harmful content in specific AI-powered applications, such as coding assistants or tutoring bots. The government can contribute to these efforts, too. U.S. government bodies, such as the Department of Commerce’s Center for AI Standards and Innovation and its partner AI Safety Institutes, can leverage findings and technical tools from both developers and independent safety organizations to craft more informed policies and standards and to fix vulnerabilities before models are publicly released.
China lacks comparable technical infrastructure to measure and minimize catastrophic risks. Beijing has historically prioritized what it calls content security: ensuring that AI systems do not generate politically sensitive or ideologically undesirable content. As a result, both regulators and companies in China have focused on ensuring that models align with Chinese Communist Party priorities. But Beijing’s attention is now widening beyond this narrow focus on social and political control to the broader risks posed by AI. In February, for example, the Cyberspace Administration of China proposed a policy to regulate interactions with humanlike AI, reflecting concerns about addiction and emotional dependence.
Rhetoric about AI risks is still more widespread than concrete policy action, but Beijing is starting to take meaningful steps. In September 2025, Chinese authorities published an updated AI safety governance framework. The framework echoes many concerns that have been reverberating across Silicon Valley, including how AI could reduce barriers to developing chemical, biological, or nuclear weapons and the possibility that it could replicate itself to a point beyond human control. The framework also warns that open-source foundation models, which dominate China’s AI ecosystem, make it easier for AI misuse to proliferate. National laboratories are testing the most advanced AI models to look for potential dangers, too. In July 2025, for instance, the Shanghai AI Lab—a major state-backed research institution focused on AI development—evaluated 18 large language models across seven frontier AI risk areas. It identified biological and chemical risks in most models, and found warning signs that several models would engage in strategic deception—observations that mirror leading American AI developers’ assessments of safety challenges in both the United States and China.
China increasingly recognizes that poor risk management could hinder its AI ambitions. But to substantially reduce potential dangers, Beijing must continue to expand these efforts. Cooperation on risk management is only as valuable as the technical infrastructure that both sides bring to the table.
Although bridging the divide between Washington and Beijing is challenging, it is not impossible. As with aviation, in which the U.S. firm Boeing and the European company Airbus compete commercially while adhering to shared safety standards, AI competition does not preclude common safety baselines. The United States and China can work in tandem with leading scientists and laboratories to establish a shared understanding of AI risks. Although what constitutes a risk will inevitably be interpreted through differing legal and cultural frameworks, the two sides can agree on a subset of severe global dangers that unambiguously threaten both countries, as well as an array of general technical solutions that they can flexibly deploy.
This shared baseline should build on existing global efforts. The International AI Safety Report, for example, offers an authoritative assessment of the state of the science underpinning advanced AI. Joint AI safety tests, such as those conducted by the United Kingdom and the United States, have also identified common emerging risks and gaps in technical capacity. Ongoing Track II, or unofficial, AI dialogues between the United States and China also enable experts to identify areas of convergence and disagreement.
The case of another area of frontier science—gene editing—illustrates the dangers of the United States and China not being aligned on shared risks and the promise of improved safety when they are. In 2018, the Chinese scientist He Jiankui announced that he had secretly edited the genes of newborn twins to try to make them resistant to HIV. This action quickly sparked global scientific backlash amid concerns about the potential dangers of genetic manipulation of the human germline, the cells that pass genes down to offspring. The Chinese government, which originally hailed the scientific breakthrough, backtracked and punished He.
In response, international bodies encouraged China to develop bioethical standards. A bipartisan group of U.S. senators, for instance, introduced a resolution promoting bilateral cooperation to prevent a race to the bottom that could lead to incidents similar to He’s experiment. China did not dismiss such efforts; instead, it worked to upgrade its bioethics regulations to align with international standards. Beijing added provisions to its civil code outlawing unauthorized germline editing and introduced rules requiring central government approval for any human gene-editing research. Chinese officials also established a national scientific ethics committee. At least on paper, many of China’s biosafety ethics rules are now in line with those in Europe and North America.
PRACTICING SAFE SHARING
But a willingness to establish a shared understanding of risk is not enough to prevent global harm. The United States and China must work together to set up technical best practices for reducing the risks of AI models. They must walk the same tightrope that the United States and the Soviet Union did with nuclear technologies during the Cold War: sharing enough information to mitigate global harms while protecting proprietary information and trade secrets. In practice, this means cooperating on how to test AI systems for dangerous capabilities and how to build safeguards that reduce the risks those tests reveal.
On testing, a priority for both countries is how to structure so-called red teams, which are groups that probe safety controls to try to expose vulnerabilities in an AI model. Both countries need to discuss how to set up experiments and how to scale red-teaming techniques to respond to a wider array of threats. Their methods could include using other large language models instead of human experts and probing for risks that emerge when AI models are embedded in applications that perform tasks such as writing code or browsing the Internet autonomously. But as they share best practices, the two countries should refrain from revealing specific tactics, which could enhance a rival’s own AI development.
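To make automated red-teaming concrete, the sketch below shows its basic loop: one model generates candidate jailbreak prompts, the target model responds, and a classifier tallies which attempts succeed. It is a minimal sketch under stated assumptions; every name in it (query_model, is_harmful, the model labels) is a hypothetical placeholder, not any lab’s actual tooling.

```python
# A minimal sketch of automated red-teaming, in which one language model
# generates adversarial prompts and a classifier scores the target model's
# replies. All names here are hypothetical placeholders.

ATTACKER = "attacker-model"   # model prompted to write jailbreak attempts
TARGET = "target-model"       # model under evaluation

SEED_BEHAVIORS = [
    "elicit step-by-step synthesis instructions for a controlled compound",
    "elicit code for self-propagating malware",
]

def query_model(model: str, prompt: str) -> str:
    """Placeholder for an API call to a hosted model."""
    raise NotImplementedError

def is_harmful(response: str) -> bool:
    """Placeholder for a harm classifier (often itself a language model)."""
    raise NotImplementedError

def red_team(behavior: str, attempts: int = 20) -> list[str]:
    """Ask the attacker model for rephrasings of a risky request and
    record which phrasings the target model actually complies with."""
    successes = []
    for i in range(attempts):
        attack_prompt = query_model(
            ATTACKER,
            f"Write attempt #{i} at phrasing this request so a "
            f"safety-trained model complies: {behavior}",
        )
        response = query_model(TARGET, attack_prompt)
        if is_harmful(response):
            successes.append(attack_prompt)
    return successes
```

Aggregated attack-success rates from loops like this one, such as the 94 percent figure cited above, are precisely the kind of result that can be shared across borders without disclosing the successful prompts themselves.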
Testing must also extend beyond the digital sphere. In what are known as wet-lab proxy studies, AI systems assist humans in live or simulated environments to assess the real-world risks of AI-enabled activity. A wet-lab proxy study could place researchers with varying levels of expertise in a controlled biosafety facility and measure whether AI-guided assistance enables them to synthesize dangerous compounds faster or more accurately than they could using published literature alone. The two countries should discuss how to conduct these tests safely, including proper lab security protocols and the use of harmless experiments that can stand in for dangerous ones, without revealing the specific methods experts use to elicit help from AI models.
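To illustrate what such a study measures, consider a minimal sketch of the analysis step: comparing task scores for an AI-assisted group against a literature-only control group and computing the “uplift” the model provides. The figures and variable names below are invented placeholders, not data from any real study.

```python
# A hedged sketch of analyzing a proxy uplift study. Each entry is one
# participant's accuracy on a harmless stand-in task; the scores are
# invented placeholders.

from statistics import mean

ai_assisted = [0.72, 0.65, 0.80, 0.77, 0.69]      # participants using a model
literature_only = [0.41, 0.55, 0.48, 0.52, 0.45]  # control group

absolute_uplift = mean(ai_assisted) - mean(literature_only)
relative_uplift = absolute_uplift / mean(literature_only)

print(f"Absolute uplift: {absolute_uplift:.2f}")
print(f"Relative uplift: {relative_uplift:.0%}")

# A large, consistent uplift on the harmless proxy task is the warning sign
# such studies are designed to surface, without anyone handling genuinely
# dangerous material.
```

Crucially, the two sides can compare notes on this kind of study design and scoring without ever exchanging the dangerous know-how that the proxy tasks stand in for.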
By focusing on how to run safety tests instead of on their content, technical collaboration can avoid disclosing sensitive information or giving up one side’s hard-fought advantages. The United States and China can thus establish these best practices without having to share the exact prompts they use to try to get models to misbehave, the strategies that can coax models into revealing sensitive data, or the precise biological or chemical knowledge that would push a model past an unsafe threshold.
Beyond testing, the two countries need to cooperate on safeguards, which are the technical mechanisms introduced during and after model development to reduce risk. Current safeguards are inadequate. They fail when they are pitted against deliberate, sophisticated attacks, yet they also block too many legitimate requests. They might refuse to help cybersecurity researchers or biodefense scientists access the information they need, for instance, if they mistakenly perceive that these researchers are trying to cause harm. Both countries would benefit from developing tools that can distinguish between legitimate and dangerous uses of AI models and that can govern the risks that arise downstream from the AI model itself.
Collaboration on safeguards is feasible if it focuses on the external tools that shape how models behave after deployment. The United States and China can discuss content filters, execution guardrails, and usage restrictions, for example, without exposing how their models are built. Approaches that require access to a model’s inner workings risk revealing proprietary methods that could enhance a rival’s capabilities. These should remain off-limits for dialogue. But even talking about general approaches would represent meaningful progress because neither country is currently equipped to fully defend against sophisticated AI misuse.
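As a rough illustration of what such an external safeguard looks like, the sketch below wraps a model with a policy filter that screens both the request and the response, while exempting vetted professionals to avoid over-blocking legitimate research. The category labels and helper functions are assumptions for illustration, not a description of any deployed system.

```python
# A minimal sketch of external safeguards layered around a model: a content
# filter that classifies each request, a usage restriction that exempts
# vetted professionals, and a second filter pass over the output. All
# category labels and helper names are illustrative assumptions.

BLOCKED_CATEGORIES = {"weapons_synthesis", "autonomous_intrusion"}
VETTED_ROLES = {"biodefense_researcher", "security_researcher"}

def classify(text: str) -> str:
    """Placeholder for a policy classifier, often a small dedicated model."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder for the underlying model call."""
    raise NotImplementedError

def guarded_generate(prompt: str, user_role: str) -> str:
    # Content filter: classify the request before it reaches the model.
    category = classify(prompt)
    # Usage restriction: vetted professionals may make requests the filter
    # would otherwise block, so legitimate research is not refused.
    if category in BLOCKED_CATEGORIES and user_role not in VETTED_ROLES:
        return "Request declined by policy."
    response = generate(prompt)
    # Output-side pass: screen the completion as well as the prompt.
    if classify(response) in BLOCKED_CATEGORIES:
        return "Response withheld by policy."
    return response
```

Because a wrapper like this sits entirely outside the model, rivals can compare designs for one without revealing anything about how their underlying models are built.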
THE ROOM WHERE IT HAPPENS
The first official U.S.-Chinese government dialogue on AI, held in Geneva in May 2024, failed because of a mismatch in expertise and priorities. The United States sent technical experts from inside the government, while China sent diplomats focused on broader foreign policy concerns and chip export controls. In other words, Washington was focused on technical risk, whereas Beijing was focused on political risk.
What is needed now is a narrow, stable conversation on global AI risks, insulated from the ups and downs of the broader U.S.-Chinese bilateral relationship. Such a discussion is in each country’s self-interest. But getting the best people in the room requires creativity. One promising approach is to involve experts connected to but situated outside government, such as individuals from the China AI Safety and Development Association, a domestic network of institutions that includes Tsinghua University, the hub of China’s AI safety research, and the Shanghai AI Lab. Bringing together researchers with both deep technical understanding and proximity to government power would anchor the dialogue in a shared vocabulary and help focus official government discussions.
Other countries with technical know-how and good diplomatic relations with both the United States and China can help sustain these conversations. The United Kingdom, for instance, can leverage the unique technical expertise in its AI Security Institute to discuss AI risk management with each of the superpowers. Such efforts can bridge U.S.-Chinese divides when bilateral tensions rise.
A successful dialogue will help policymakers gain insight into dangerous activity without stifling innovation. Ultimately, investing in cooperation will help both countries detect risks from new models, improve AI safeguards, and promote transparency among companies, governments, and international users on what they know—and even more important, on what they don’t know—about emerging AI capabilities. Only by working together can the United States and China understand and mitigate the global AI risks that threaten them both. Prudent conversations now could prevent catastrophic harm later.