GPT-5.5 Is as Dangerous as Anthropic Mythos

Sam Altman called Anthropic's restricted AI rollout fear-based marketing. Then he did exactly the same thing.

Li Nguyen

The UK’s AI Security Institute found GPT-5.5 matches Anthropic Mythos on cybersecurity benchmarks. OpenAI then restricted access to its Cyber model — the exact move Sam Altman called “fear-based marketing” when Anthropic did it. Meanwhile, a new study found that most cybercriminals can’t even use AI properly.


GPT-5.5 cybersecurity capabilities match Anthropic Mythos on key benchmarks — and that finding has quietly reframed one of AI’s most heated debates. The UK AI Security Institute (AISI) published evaluation results confirming that OpenAI’s latest general-purpose model achieved a “similar level of performance” to Anthropic’s closely guarded Claude Mythos Preview in controlled cybersecurity tests. The results appeared on 29 April 2026 — the same week OpenAI announced a restricted rollout of its own cybersecurity model, GPT-5.5-Cyber. CEO Sam Altman had previously called Anthropic’s restricted Mythos release “fear-based marketing.” He then deployed the same strategy himself.

The third piece of the puzzle is counterintuitive. A peer-reviewed study from the University of Edinburgh, published on 5 May 2026, analysed more than 100 million forum posts from underground cybercriminal communities. Its conclusion: most hackers have tried AI tools and come away disappointed. The technology is not giving them meaningful new capabilities. The people who benefit most from AI coding assistance are those who can already code. That finding is in direct tension with the alarm surrounding both GPT-5.5 and Mythos.

What’s Happening & Why It Matters

GPT-5.5 Cybersecurity Capabilities: The Benchmark Numbers

The AISI evaluated GPT-5.5 against a suite of 95 narrow cybersecurity tasks across four difficulty tiers. The results placed GPT-5.5 at an aggregate expert task score of 71.4%. Anthropic Mythos Preview scored 68.6% on the same evaluation framework. That gap — 2.8 percentage points — is real but narrow. Neither model dominates the other decisively on the benchmark.

The most telling data point is a specific test called TLO — a complex, multi-step penetration testing simulation. Anthropic Mythos completed TLO end-to-end in 3 of 10 attempts. GPT-5.5 completed it in 2 of 10 attempts, at a 100 million token budget per attempt. On that single task, Mythos holds a small advantage. On reverse-engineering speed and cost, GPT-5.5 leads — completing a malware reverse-engineering task in 10 minutes for $1.73 (€1.59).

The AISI’s interpretation matters as much as the numbers. The institute stated that GPT-5.5’s cybersecurity capabilities appear to be “a byproduct of more general improvements in long-horizon autonomy, reasoning, and coding” rather than a cybersecurity-specific breakthrough. In other words, dangerous cybersecurity capabilities are not a unique feature of any one model. They are an emergent property of frontier AI in general.

GPT-5.5-Cyber: OpenAI Adopts the Playbook It Criticised

The GPT-5.5 cybersecurity benchmark results reached the public the same week Altman announced a restricted rollout of GPT-5.5-Cyber — a specialised version of GPT-5.5 built for cybersecurity work. Altman posted on X: “We’re starting rollout of GPT-5.5-Cyber, a frontier cybersecurity model, to critical cyber defenders in the next few days. We will work with the entire ecosystem and the government to figure out trusted access for cyber; we want to help secure companies/infrastructure rapidly.”

Access requires a formal application, and eligible users must demonstrate legitimate cybersecurity credentials and planned defensive use cases. The programme — called Trusted Access for Cyber (TAC) — is tiered: critical defenders gain access to GPT-5.5 with reduced safeguard friction for specific cybersecurity tasks, including penetration testing, vulnerability identification and exploitation, and malware reverse engineering. An OpenAI spokesperson confirmed the programme had scaled to “thousands of verified defenders and hundreds of teams responsible for protecting critical software.”

That description matches exactly how Anthropic positioned Mythos under Project Glasswing. The irony is total and undeniable. Altman called Anthropic’s restricted Mythos release “fear-based marketing” in April. He then built an identical restricted-access programme and released it two weeks later. As TechCrunch noted, the language OpenAI uses to justify TAC mirrors the language Anthropic used to justify Project Glasswing. US AI Czar David Sacks had previously accused Anthropic of “using fear as a marketing tool” but conceded the cybersecurity findings were “more on the legitimate side.” That concession applies to OpenAI equally.

The GPT-5.5 Safety Classification: “High” but Not “Critical”

OpenAI’s own preparedness framework rates cybersecurity risk across two thresholds. The “Critical” threshold describes models capable of “unprecedented new pathways to severe harm” — specifically, autonomous zero-day exploit development at scale. GPT-5.5 does not cross that threshold. Instead, the model sits at the “High” classification — capable of amplifying “existing pathways to severe harm.” GPT-5.5 makes existing attack methodologies faster and more accessible. It does not appear to unlock entirely new attack categories on its own.

OpenAI VP of Research Mia Glaese confirmed the safety assessment. “GPT-5.5 underwent extensive third-party safeguard testing and red teaming for cyber and bio risks, and we’ve been iterating on our cyber safeguards for months with increasingly cyber capable models,” she said. The “High” rating places GPT-5.5 in the same risk category Anthropic had assigned to its own model ahead of the Mythos release — making the convergence between the two labs all the more pointed.

GPT-5.5 vs. Mythos: Access, Deployment, and Real-World Reach

The benchmark scores tell one story. The access reality tells another. Mythos access is restricted to approximately 50 organisations. The White House has actively opposed expansion to 120 organisations — a position that effectively makes Mythos the most powerful AI cybersecurity tool almost no one can use. GPT-5.5-Cyber is rolling out to “thousands” through TAC, with API access scheduled for enterprise deployment. On the deployment trajectory, GPT-5.5 is more likely to reach production security workflows in the near term.

The Pentagon dimension adds further context. On 1 May 2026, the Department of Defense signed classified AI agreements with seven companies: SpaceX, OpenAI, Google, Microsoft, AWS, Nvidia, and Reflection AI. Anthropic was not on that list. Anthropic has been designated a Pentagon “supply chain risk” after refusing to remove restrictions on autonomous weapons from its military contract terms. GPT-5.5-Cyber is entering government and defence workflows. Mythos is not.

The Cybercriminal Reality: Disappointed by AI

Against the backdrop of high-level alarm about GPT-5.5’s cybersecurity capabilities, the University of Edinburgh study published on 5 May 2026 offers a significant counterpoint. Researchers analysed more than 100 million forum posts from underground cybercriminal communities using the CrimeBB database, which scrapes data from dark web forums. The dataset was processed both manually and with a large language model.

The findings are striking. Cybercriminals have expressed significant interest in AI tools. In practice, those tools have not meaningfully changed their methods. “Many of the reviews and discussions describe AI tools as not particularly useful,” the study reads. Researchers found “no significant evidence” that hackers achieved any improvement in their hacking activity using AI — either as a learning tool or in developing more effective attack capabilities. The study concludes that AI coding assistants primarily benefit people who can already code. Hackers who lack foundational coding skills receive no meaningful boost from AI assistance. In other words, AI makes skilled attackers marginally more efficient. It does not turn unskilled actors into skilled ones.

The Dual-Use Dilemma Nobody Has Solved

The three stories together define the central challenge in AI cybersecurity policy. GPT-5.5 and Mythos match each other at the frontier of cybersecurity capability. Both can help security professionals find vulnerabilities faster, reverse-engineer malware, and test infrastructure at scale. The same capabilities help attackers do the same things in the opposite direction. That dual-use reality is not new — every security tool is dual-use. The new element is accessibility and cost: malware reverse-engineering that once took a skilled analyst hours now costs $1.73 and takes 10 minutes.

Europol’s IOCTA 2026 report — published 28 April 2026 — states the systemic challenge outright. The agency identifies a growing “velocity gap” between law enforcement and cybercriminals. Criminals use AI to automate attacks, personalise scams, and reduce operational timelines “from weeks to hours.” Law enforcement adaptation lags. Europol Executive Director Catherine De Bolle put it directly: “Cybercriminals are rapidly exploiting advanced technologies, particularly AI tools, to enhance the speed, efficiency, and scope of their illicit activities. The tools not only enable automation in criminal processes but blur the lines between legitimate and malicious uses of technology.”

The Edinburgh study complicates that picture somewhat. Europol’s concern is about criminal networks at scale — large, coordinated, well-resourced operations that can integrate AI tools into industrial fraud pipelines. The Edinburgh study looked at individual hackers on underground forums, most of whom lack the skills to exploit AI effectively. Both findings can be true simultaneously. The most dangerous actors — nation-state groups, sophisticated ransomware operations, organised fraud networks — likely benefit from AI in ways that individual script kiddies do not.

TF Summary: What’s Next

The AISI will continue evaluating frontier models as new versions are released. The finding that GPT-5.5 cybersecurity capabilities match Mythos on key benchmarks is likely to accelerate calls for a standardised, global evaluation framework — one that applies across all frontier labs rather than relying on individual company assessments. The Frontier Model Forum — which OpenAI, Anthropic, Google, and Microsoft all participate in — has reportedly launched a joint initiative to address AI distillation and offensive capability proliferation. That initiative will need to move faster than the models it is trying to govern.

MY FORECAST: For OpenAI, the immediate priority is scaling TAC to the enterprise customers, government agencies, and security researchers it wants to reach before the next model generation arrives. GPT-5.5 Pro — the higher-capability variant — is already available to Pro, Business, and Enterprise ChatGPT subscribers. API access for GPT-5.5 is rolling out in waves. The cybersecurity capability race is not slowing down. The Edinburgh study is a useful corrective to panic — most criminal actors are not benefiting from the tools in the ways the most alarming scenarios suggest. That reassurance does not extend to the sophisticated actors who can. And those actors are precisely the ones the TAC and Project Glasswing programmes are trying to outrun.

