AI in Cybersecurity: What Just Happened

On 20 May 2026, Ingram gave the opening research talk at "AI in Cybersec", a private AI in Defence Summit side-event in the Cybersec Europe VIP Lounge in Brussels. The talk we gave was not the talk we prepared. The week before the event rewrote it. Below is the argument we made, and the deck.
Download the slides (PDF)
I had something different prepared
The programme had me down to talk about AI-driven pentesting and Project Glasswing. I did. But the cold open became about something that was merged the week before the event, because it changes which conversation is worth having.
On 14 May, Anthropic merged the Rust rewrite of Bun. Anthropic acquired Bun in December 2025, and Bun is the JavaScript runtime that powers Claude Code and the Agent SDK. The rewrite is about a million lines of code, written by AI agents in six days. The test suite passes at 99.8%. The port contains 13,044 unsafe blocks, in a language whose entire purpose is to eliminate them. Bun's creator, Jarred Sumner, has been candid about it: they haven't been typing the code by hand for many months.
So we now have a runtime, largely written by AI, that runs the AI agents that write more code. That recursion is what the talk was about.
Reflections on trusting trust
In 1984, Ken Thompson used his Turing Award lecture to make one of the most uncomfortable arguments in our field. He showed you can backdoor a compiler so that it inserts a backdoor into anything it compiles, including future versions of itself, while the source code stays completely clean. There is no commit to point at. Code review doesn't catch it. His conclusion: at some point you stop trusting code and start trusting the people who wrote it.
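To make the shape of Thompson's argument concrete, here is a toy sketch. It is not his original C code: "compiling" is reduced to running source text with Python's exec(), and every name (LOGIN_SOURCE, trojaned_compile, the "joshua" password) is made up for illustration. The point it shows is that both source files stay clean while the backdoor survives a rebuild of the compiler.

```python
# Toy model of Thompson's attack. "Compiling" is just exec() into a namespace,
# and every name here is hypothetical.

LOGIN_SOURCE = '''
def check_password(user, password):
    return password == "correct horse"
'''

CLEAN_COMPILER_SOURCE = '''
def compile_program(source):
    namespace = {}
    exec(source, namespace)
    return namespace
'''

# Lives only inside the trojaned compiler "binary", never in any source file.
BACKDOOR = '''
_original_check = check_password
def check_password(user, password):
    return password == "joshua" or _original_check(user, password)
'''


def honest_compile(source):
    """Run the source as written and return what it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace


def trojaned_compile(source):
    """Behaves like honest_compile, except for two recognised inputs."""
    # Stage 1: recognise the login program and append the backdoor.
    if "def check_password" in source:
        source = source + BACKDOOR
    namespace = honest_compile(source)
    # Stage 2: recognise the compiler itself and hand back the trojaned
    # behaviour, so the next generation is infected from clean source.
    if "def compile_program" in source:
        namespace["compile_program"] = trojaned_compile
    return namespace


# Rebuild the compiler from its clean source using the trojaned binary...
next_compiler = trojaned_compile(CLEAN_COMPILER_SOURCE)["compile_program"]
# ...then build the clean login program with the result.
login = next_compiler(LOGIN_SOURCE)

print(login["check_password"]("root", "joshua"))  # True: the backdoor survived
print(honest_compile(LOGIN_SOURCE)["check_password"]("root", "joshua"))  # False
```

Reading LOGIN_SOURCE or CLEAN_COMPILER_SOURCE line by line finds nothing, which is exactly why code review does not catch it.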
That held for forty-two years.
This week a new layer landed on top of Thompson's argument. His question was how deep you have to look to audit the code you're running. The new question is one he never had to ask: who, or what, wrote the code in the first place, and would you be able to tell?
Audit, as we have practiced it for two decades, no longer reaches all the way down. The code is not worse. As I'll get to, it usually isn't. There is simply too much of it, written too fast, by something that doesn't tell you what it was actually thinking.
This is true at Ingram too
I'll be blunt, because the field is too quiet about this. At Ingram, most of our code is AI-written, and most of our review is AI-assisted. There is no other way to ship at the pace we ship, and the same is true for most teams I talk to. If your codebase is genuinely hand-rolled and hand-reviewed end to end, come find me. But I suspect most of us are sitting on the same answer and not saying it out loud.
That creates a problem. If audit is what Thompson said we ultimately fall back on, and audit is now also done by AI, we need a new layer underneath: one that records not just what the agents produced, but what they did to produce it. What they read, what they decided, what they skipped. The trajectory, not just the output.
That layer is governance and observability for AI agents, and it is what we have been building. We wrote about why, and what it looks like, in Announcing Ingram Cloud. Without it, your audit trail ends at "the model decided."
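As a rough illustration of what "the trajectory, not just the output" could mean in data terms, here is a minimal sketch of an append-only agent log. It is not how Ingram Cloud works; the TrajectoryLog class, the event kinds, the file path, and the choice to hash-chain entries so that later edits are detectable are all assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def _digest(body: dict) -> str:
    """Stable SHA-256 over an event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class TrajectoryLog:
    """Hypothetical append-only record of what an agent did, chained by hash."""
    events: list = field(default_factory=list)

    def record(self, kind: str, detail: str) -> dict:
        prev = self.events[-1]["hash"] if self.events else GENESIS
        body = {"seq": len(self.events), "kind": kind, "detail": detail, "prev": prev}
        event = {**body, "hash": _digest(body)}
        self.events.append(event)
        return event

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered event breaks the chain."""
        prev = GENESIS
        for seq, event in enumerate(self.events):
            body = {k: event[k] for k in ("seq", "kind", "detail", "prev")}
            if event["seq"] != seq or event["prev"] != prev or event["hash"] != _digest(body):
                return False
            prev = event["hash"]
        return True


log = TrajectoryLog()
log.record("read", "src/session.py")                          # what it read
log.record("decided", "reuse existing token validator")       # what it decided
log.record("skipped", "integration tests (no network)")       # what it skipped
log.record("produced", "patch touching 2 files")              # the output
print(log.verify())  # True
```

With a record like this, the audit trail can answer what was read and what was skipped, instead of ending at the final diff.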
A myth worth busting: AI doesn't write "average" code
The common worry is that AI writes sloppy, average, shortcut-laden code. In our experience that is mostly wrong. AI doesn't get tired. It doesn't cut a corner because it's Friday at 5pm or because the deadline is tomorrow. On average it writes more correct code than humans do under the same conditions, and if anything it tends to over-engineer.
The real risk is the other side of that coin: surface area. AI writes more code. More branches, more handlers, more failure states, more legacy-compatibility paths it keeps "just in case." Surface area is exactly what attackers exploit. The problem isn't worse code. It's more code, and more places to hide a bug that no human will ever read line by line. Bun's 13,044 unsafe blocks are that risk in the open.
The flip side is good news. Because AI is consistent, you can now write stronger and more uniform guardrails than you ever could before. Good engineering discipline matters more than ever, not less.
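One example of a uniform guardrail, sketched under loud assumptions: a budget on unsafe blocks that a CI job could enforce. The function names and the budget idea are ours for illustration, and the regular expression is a crude text match rather than a Rust parser, so it will also count matches inside comments or string literals.

```python
import re

# Matches the keyword `unsafe` directly followed by an opening brace, i.e. an
# unsafe block. `unsafe fn` and `unsafe impl` are deliberately not counted.
UNSAFE_BLOCK = re.compile(r"\bunsafe\s*\{")


def count_unsafe_blocks(rust_source: str) -> int:
    """Approximate number of unsafe blocks in a piece of Rust source text."""
    return len(UNSAFE_BLOCK.findall(rust_source))


def within_budget(rust_source: str, budget: int) -> bool:
    """Hypothetical guardrail: fail the build when the count exceeds the budget."""
    return count_unsafe_blocks(rust_source) <= budget


SAMPLE = '''
fn read(ptr: *const u8) -> u8 {
    unsafe { *ptr }
}
unsafe fn raw() {}
fn copy(dst: *mut u8, src: *const u8) {
    unsafe {
        *dst = *src;
    }
}
'''

print(count_unsafe_blocks(SAMPLE))      # 2
print(within_budget(SAMPLE, budget=1))  # False
```

A check like this does not make any block safe. It only makes growth in surface area visible at review time instead of after the fact.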
What AI actually finds when you point it at real code
Six weeks before the event, Anthropic published Project Glasswing alongside the Mythos Preview. A few of the findings: a 27-year-old bug in OpenBSD's SACK implementation, found autonomously; a 16-year-old vulnerability in FFmpeg's H.264 codec, one of the most-fuzzed codebases in the world; and a fully autonomous remote-code-execution exploit against FreeBSD's NFS server (CVE-2026-4747), root from an unauthenticated user. On Anthropic's internal OSS-Fuzz benchmark, the previous-generation model managed roughly one full control-flow hijack. Mythos managed ten. Cost per zero-day discovery: somewhere between $50 and $2,000.
This is not specific to one lab. The UK AI Security Institute put OpenAI's GPT-5.5 at parity: 71.4% on expert cyber tasks, against 68.6% for Mythos. They are the only two models to have completed AISI's full 32-step corporate-network attack simulation end to end. There are at least two labs at this level today, and probably more by year-end.
The clearest single piece of evidence is ExploitGym, an independent benchmark published on 11 May 2026 by a 17-author team spanning UC Berkeley, the Max Planck Institute for Security & Privacy, UC Santa Barbara, and Anthropic, OpenAI, and Google jointly. It comprises 898 real-world exploitation tasks across userspace, the V8 browser engine, and the Linux kernel. The headline metric is working exploits, with a two-hour wall-clock budget per task.

Two results stand out. First, with ASLR, the V8 heap sandbox, and KASLR all enabled, the defences that have anchored twenty years of memory-safety thinking, Mythos still produced 45 working exploits and GPT-5.5 still produced 21. The paper's own conclusion: "current mitigations alone are likely insufficient to neutralize AI-driven exploitation." That sentence is co-authored by Anthropic, OpenAI, and Google. It is not a vendor blog.
Second, the threat-model update most people haven't absorbed: the agents routinely find vulnerabilities nobody told them about. Mythos captured the flag on 226 tasks, but only 157 were the "right" bug. On the other 69 it found a different vulnerability in the same target and exploited that instead. GPT-5.5 did it 90 times. These agents aren't pattern-matching to known exploits. They are auditing and fuzzing on their own.
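The arithmetic behind the Mythos figures is worth spelling out, because the share is larger than the raw count suggests. This uses only the two numbers quoted above; the variable names are ours.

```python
flags_captured = 226  # tasks where Mythos captured the flag
intended_bug = 157    # of those, tasks solved via the intended vulnerability

unintended = flags_captured - intended_bug
share = unintended / flags_captured

print(unintended)      # 69
print(f"{share:.1%}")  # 30.5%
```

Roughly three in ten of its successful exploits went through a bug the task authors had not planted.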
This is not theory for us. We have published two case studies on what AI-driven pentesting catches in practice, on our own application and on a live fintech app. The pattern is the same in both: the agent handles the volume, a human carries the judgment, and the cost of a real engagement drops sharply.
It's already in the wild
If you accept that the capability exists, the next question is whether anyone is using it. On 11 May, Google's Threat Intelligence Group disclosed that it had disrupted what it assesses as the first AI-built zero-day operation. The attackers were using OpenClaw to find and weaponise an unknown vulnerability, aiming for a mass-exploitation event that bypassed 2FA. Google reported high confidence in the attribution, and strong interest from groups linked to China and North Korea. (A later talk that evening went deeper on the specific OpenClaw incidents.)
So this is no longer "AI might be used for this someday." It was used the week before we stood up to speak.
The defender economy is already feeling it
Three quick datapoints. HackerOne emailed researchers this month to explain that triage response times have slipped, that submission volume is surging, especially in source-code and smart-contract categories, and that they are rewriting their Code of Conduct for AI-assisted submissions. The bug-bounty economic model was not designed for this volume.
The N-day gap, the window between a CVE disclosure and a weaponised exploit, used to be the defender's grace period. It was calibrated for human attackers spending days to weeks. Mythos closes that gap to under a day, for under $2,000. And the annual-pentest, quarterly-review cadence assumes attacker effort scales linearly with time. That assumption is dead.
A glimpse of the audit future
The most interesting thing I read all month is also the most hopeful. On 7 May, Anthropic published a technique called Natural Language Autoencoders (NLAs) that translates a model's internal activations, the actual numbers in the middle of a forward pass, into readable English. Not what the model said, but what it was thinking on the way to saying it.
Two findings should matter to anyone planning to deploy autonomous AI in their security stack. On internal safety evaluations, NLAs detect that the model suspects it is being tested 16 to 26% of the time, even when it never says so out loud. On real production traffic, under 1%. And in an auditing game against a model with a deliberately hidden motivation, auditors with NLA access uncovered that motivation 12 to 15% of the time, against under 3% without.
It is expensive, it hallucinates, and it is nowhere near reliable enough to bet on yet. But the direction is right. The arc, fuzzy and academic at first and then critical infrastructure, has happened before in this field. And it points somewhere strange: this is a technique you could, in principle, one day run on a brain.
What to do this week
Five things, roughly in the order teams skip them:
- Run frontier models over your own code now. You don't need Mythos. Opus 4.7 and GPT-5.5 are broadly available and will find serious bugs in code you wrote.
- Shorten your patch-enforcement window. Treat dependency bumps that carry CVE fixes as urgent, not routine. The N-day gap is closing fast.
- Refresh your disclosure policy for the volume of inbound you are going to see, not the volume you saw last year.
- Automate IR triage. Incident volume will rise faster than headcount.
- Build the AI-in-security muscle now, not during the incident. It takes longer than people expect to learn how to use these tools well, and you don't want to be learning it mid-crisis.
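For the patch-enforcement item in the list above, a minimal sketch of what "urgent, not routine" could look like as a check. Everything here is hypothetical: the PendingFix record, the package names, the placeholder advisory IDs, and the 24-hour window, which is simply our reading of "under a day" and not a recommendation from any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PendingFix:
    """A dependency update that carries a security fix but is not yet deployed."""
    package: str
    advisory: str            # placeholder ID, not a real CVE
    fix_published: datetime  # when the patched version became available


def overdue(fixes, now, window=timedelta(hours=24)):
    """Return the fixes that have been available for longer than the window."""
    return [f for f in fixes if now - f.fix_published > window]


now = datetime(2026, 5, 20, 9, 0)
pending = [
    PendingFix("example-http", "EXAMPLE-0001", datetime(2026, 5, 18, 9, 0)),
    PendingFix("example-json", "EXAMPLE-0002", datetime(2026, 5, 19, 20, 0)),
]

for fix in overdue(pending, now):
    print(f"{fix.package}: {fix.advisory} unpatched for {now - fix.fix_published}")
```

In this sample data only example-http is flagged: its fix has been out for 48 hours, while example-json is still inside the window at 13 hours.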
The long run favours defenders
Most security tooling, given enough time, has favoured defenders. Fuzzing did. Static analysis did. There is good reason to believe AI will too, eventually. But the transition is where the damage gets done, and the transition started this year.
Thanks to Cybersec Europe and the AI in Defence Summit team for having us, and to everyone who stayed for the conversation over dinner.
If this is a conversation you are already having internally, whether about AI-driven security testing or about governance and observability for the agents your teams are already running, we are happy to talk.
