Everyone is panicking about AI-generated zero days. They should be paying attention to the other side of the equation.
Anthropic recently showed Claude finding hundreds of validated, high-severity vulnerabilities using a simple loop: take every source file in a project, ask the model to find an exploitable bug, then verify the result with a second pass. Thomas Ptacek wrote a whole piece arguing that vulnerability research as a human discipline is effectively over. The conversation is everywhere right now.
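That loop fits in a few lines. This is an illustrative reconstruction, not Anthropic's actual harness: `ask_model` is a placeholder for whatever LLM call you use, and the prompts are my own wording.

```python
from pathlib import Path

def scan_project(root, ask_model):
    """Two-pass loop: ask for one candidate bug per file, then verify it.

    `ask_model(prompt)` is a stand-in for your LLM call. It just needs to
    return a string (empty string means "nothing found").
    """
    findings = []
    for path in sorted(Path(root).rglob("*.c")):
        source = path.read_text(errors="replace")
        candidate = ask_model(
            f"Find one exploitable bug in this file:\n{source}"
        )
        if not candidate:
            continue
        # Second pass: a fresh prompt that tries to confirm or refute it.
        verdict = ask_model(
            f"Is this finding a real, exploitable bug? Answer YES or NO.\n"
            f"File: {path.name}\nFinding: {candidate}"
        )
        if verdict.strip().upper().startswith("YES"):
            findings.append((str(path), candidate))
    return findings
```

Swap the glob pattern and prompts for your language and threat model; the structure (propose, then independently verify) is the whole trick.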
They're not wrong about the capability. But the conclusion most people are drawing, that attackers just got a massive upgrade, misses something important.
Finding a zero day has always required three things: expertise, luck, and resilience. You need to understand the code. You need some luck, because bugs hide in weird places. And you need the patience to go down rabbit hole after rabbit hole, find nothing, and keep going. The best exploit developers make their own luck by combining all three. Anyone can find a zero day with proper training; that's literally why PentesterLab exists. But doing it at scale has always required all three, usually at the same time.
Agents collapse two of those. They don't get bored. They'll go down an unlimited number of rabbit holes. Luck and resilience stop being bottlenecks.
The limiting factor is no longer finding bugs. It's knowing what matters. Agents will hand you a pile of findings. The skill is sorting signal from noise. That part hasn't been automated. Yet.
But here's where it gets interesting.
Offensive research teams that sell exploits work under extreme secrecy. They can't share their targets with anyone. They can't send proprietary code through an API. Many work in air-gapped networks with no internet access. They can't use Claude. They can't use GPT. They're stuck with whatever open-weight models they can run locally. Smaller context windows. Weaker reasoning. No cross-project learning. Always months behind the frontier.
Defenders are usually far less constrained. Or at least, they can decide to be. It's a risk-management decision.
You can point Claude at your own codebase right now. No air gap. No secrecy problem. Full context window. No restrictions on what you share, because it's your code.
For the first time, the person with the most to lose gets access to the best tools.
Modern exploits are not single bugs. They're chains. A memory corruption to get initial control. A sandbox escape to break out of isolation. A privilege escalation to reach the kernel. A persistence mechanism to survive a reboot. Every link has to hold, or the whole thing falls apart.
Attackers need the full chain. Defenders only need to break one link.
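The arithmetic behind that asymmetry is worth making explicit. A chain only fires if every link holds, so its reliability is the product of the per-link reliabilities. The numbers below are illustrative, not measurements:

```python
def chain_reliability(link_success):
    """Probability a full exploit chain fires: every link must hold."""
    p = 1.0
    for link in link_success:
        p *= link
    return p

# Hypothetical four-link chain, each link working 90% of the time.
chain = {"memory corruption": 0.9, "sandbox escape": 0.9,
         "privilege escalation": 0.9, "persistence": 0.9}
print(round(chain_reliability(chain.values()), 3))  # 0.656

# The defender fixes one link; the whole chain's reliability collapses.
chain["sandbox escape"] = 0.0
print(chain_reliability(chain.values()))  # 0.0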
This has always been true in theory. In practice, defenders never had the bandwidth to systematically audit every layer. You'd harden what you could and hope the rest held. The chain survived because nobody had time to inspect every link.
That constraint is gone. Run your agent against the sandbox layer this week. Run it against the IPC boundary next week. Run it against your privilege boundaries the week after. Each pass that finds and fixes a weakness removes a link the attacker was counting on. Do it on a regular cycle and you're not just patching known bugs. You're degrading the attacker's ability to build a working chain at all, and making it nearly impossible to keep a reliable chain working across successive versions of your application. The attacker's maintenance cost compounds with every release.
The attacker has to get everything right. You only have to get one thing a bit better, repeatedly.
And here's the best part: as a defender, you don't even need to prove exploitability. That's the attacker's problem. You don't need to convince anyone it's worth fixing. If the code looks fragile, make it less fragile. If a pattern keeps showing up in findings, rewrite it. You just make the code a little bit better every day. Do that consistently and you're breaking links in chains that haven't been built yet.
This is not theoretical. I've been running exactly these kinds of workflows against real code with Claude. Take the same approach Anthropic described and run it defensively. Loop through your source files. Ask the model to find weaknesses. Fix what comes back. Rewrite what feels fragile. Harden the patterns that keep producing findings.
The shift is not finding exploitable bugs faster. It's fixing weaknesses, repeatably. You run the loop with better models, against code you keep improving. The attacker runs it with worse models, against code that keeps changing under their feet.
Every bug you fix, with a proper test behind it, is gone forever. Every chain an attacker builds can break the next time you push a commit. Your progress compounds. Their job gets harder.
Attackers need to be clever and they're now under-equipped. You just have to be consistent, with better tooling.