Virologists have a new colleague in the lab, and he’s not human. A recent study, shared exclusively with TIME, has shown that AI models like GPT and Claude now outperform PhD virologists at solving complex problems in the lab. It’s an epochal turning point: while expert virologists achieved an average score of 22.1% in their declared areas of expertise, OpenAI’s o3 model achieved an accuracy of 43.8%. “It makes me a little nervous,” admits Seth Donough, a researcher at SecureBio and co-author of the study, “because for the first time in history, virtually anyone has access to a non-judgmental AI virology expert who could guide them through complex laboratory processes to create bioweapons.” A valid concern that raises a disturbing question: are we prepared for a world where the creation of deadly viruses is accessible to anyone with an internet connection?
A technological sword of Damocles
The study was conducted by researchers from the Center for AI Safety, the MIT Media Lab, the Brazilian university UFABC, and the non-profit organization SecureBio. The results are impressive and worrying at the same time: in addition to OpenAI’s o3 with its 43.8% accuracy, Google’s Gemini 2.5 Pro also reached a remarkable 37.6%, clearly surpassing the human experts.

These aren’t theoretical or easily “Googled” questions. The tests were specifically designed to measure the ability to solve complex practical problems. “The questions look like this: ‘I grew this particular virus in this type of cell, under these specific conditions, for this amount of time. I have this amount of information about what went wrong. Can you tell me what the most likely problem is?’” explains Donough.
The rate at which these models are improving is equally alarming. Anthropic’s Claude 3.5 Sonnet, for example, went from 26.9% to 33.6% accuracy between its June 2024 and October 2024 versions. And a preliminary version of OpenAI’s GPT-4.5 surpassed GPT-4o by nearly 10 percentage points.
AI Outclassing Virologists: The Bright Side of the Coin
Despite the risks, the positive implications of this discovery are enormous. AI could help expert virologists in their critical work fighting viruses by accelerating the development of drugs and vaccines, improving the design of clinical trials, and sharpening methods of disease detection.
Tom Inglesby, director of the Johns Hopkins Center for Health Security, highlights another potential benefit: “These models could help scientists in different parts of the world, who don’t yet have that kind of expertise or capacity, to do valuable day-to-day work on diseases that occur in their countries.”
One group of researchers, for example, found that AI helped them better understand hemorrhagic fever viruses in sub-Saharan Africa.

The Dark Side of Knowledge
Virology and biomedicine have long been at the forefront of AI leaders’ motivations to build ever more powerful models. The problem is that these same tools could be used by malicious actors to create viruses and bioweapons without any of the typical training required to enter biosafety level 4 (BSL-4) laboratories, the ones that handle the most dangerous and exotic infectious agents.
“There have been a fair number of cases throughout history where someone has attempted to create a biological weapon, and one of the main reasons they haven’t been successful is because they didn’t have access to the right level of expertise,” explains Donough. “That’s why we need to be cautious about how these capabilities are distributed.”
Months ago, the study’s authors sent the results to the major AI labs. In response, xAI published a risk management framework committing to implement virological safeguards for future versions of its Grok model. OpenAI also said it has “implemented new system-wide mitigation measures for biological risks” for its new models released last week. Anthropic has included the model’s performance results in recent system cards, but has not proposed specific mitigation measures. The team responsible for Gemini (Google AI) declined to comment to TIME at this time.
Possible solutions
Dan Hendrycks, director of the Center for AI Safety, urges AI companies to implement “guardrails” to prevent this type of use.
“If companies don’t have good safeguards in place within six months, I think it’s a gamble,” he says.
Hendrycks says the solution isn’t to stop these models or slow their progress, but to “put some boundaries around them,” so that only trusted third parties have access to their unfiltered versions. “We should only give people who are using them legitimately (like a researcher in the biology department at MIT) the ability to ask how to manipulate deadly viruses,” he says. “But ordinary people who created an account a second ago shouldn’t be able to do that.”
Inglesby, for his part, argues that industry self-regulation is not enough. He calls on lawmakers and political leaders to develop a policy approach to regulating AI biohazards. “The current situation is that the best companies are spending time and money doing this work, which is good for all of us, but other companies don’t have to do it,” he says. “It doesn’t make sense. It’s not good for the public to have no understanding of what’s going on.”
“When a new version of an LLM is about to be released,” adds Inglesby, “there should be a requirement to evaluate that model and make sure it doesn’t produce pandemic-level outcomes.”
We are witnessing the dawn of an era in which the democratization of advanced knowledge also brings with it unprecedented existential risks. The race for ever more powerful artificial intelligence is proceeding at such a pace that our governance structures and ethical reflections cannot keep up.
In an ideal world, these technologies would only bring benefits, but history teaches us that we can never be too careful.