
A baseline GPT-5.1-based coding agent detected only 34% of vulnerabilities in 90 exploited DeFi contracts, but a purpose-built AI security agent running on the same underlying model bested that by a long shot, detecting 92% of the vulnerabilities.
The findings come from Cecuro, which examined 90 exploits that occurred between September 30, 2024, and early 2026, together amounting to $96.8 million in exploit value.
“These results demonstrate that deep domain expertise and agent optimization can 2-3x vulnerability detection performance over baseline agentic code review on the same underlying model,” according to a Cecuro blog post.
“Separately, frontier agents now execute end-to-end exploits on 72% of known vulnerable contracts, underscoring that offensive capability is advancing in parallel,” the researchers wrote, noting that “the benchmark dataset and the baseline agent are open-sourced; the full Cecuro Security Agent is not, given the risks of making autonomous exploit tooling publicly available.”
Explaining that 2025 crypto theft hit $3.4 billion, “of which $1.5 billion came from a single compromise of Bybit in February,” the researchers point to a massive issue in the landscape: smart contract security “is not keeping up with the requirements of the financial systems it powers.”
But, they note, protecting smart contracts and Web3 cybersecurity typically proves “very challenging,” particularly because expert knowledge is in short supply. “The vulnerabilities that cause the largest losses are rarely obvious,” they wrote.
Deep expertise in programming languages, as well as governance mechanisms and DeFi protocol economics, is required to identify vulnerabilities. But that latter skillset “is scarce and highly sought after.”
Professional human audits can help, but are costly and time-consuming. Plus, they “only cover the codebase at a single point in time,” resulting in projects lacking “full audit coverage” or simply skipping the process altogether.
“These findings show that DeFi contract security has now become an ‘attacker AI agents vs defenders AI agents’ regime for smart contracts,” says Mayuresh Dani, Security Research Manager at the Qualys Threat Research Unit.
While “threat actors can already use agents to scan thousands of contracts and autonomously weaponize many known bug classes for a marginal cost per attempt,” Dani says, the benchmark “shows us that these need to be based on DeFi-specific heuristics and should contain protocol-aware detections.”
Cecuro’s study found three challenges that researchers say “consistently limited agent performance.” First, there is no verifiable feedback: discovery is difficult, “knowing where to look before you know what you are looking for.” Second, there is no systematic coverage.
“Without domain-specific guidance, agents tend to follow shallow paths and spend their budget on surface-level patterns,” the researchers wrote. “In some runs, the basic agent traced a peripheral contract for most of its budget and never reached the vulnerable function.”
And, third, they noted context saturation and output variance. “A pattern we observed across runs is that agents tend to treat the review as complete once a handful of findings are flagged, even when large parts of the codebase remain unexamined,” the researchers said, something that “persists even with explicit planning mechanisms like todo tools and system reminders in place.”
The upshot? Detection is crucial. And AI can improve the odds. “Projects not using AI for defense are exposed in a way that simply was not true a year ago,” the researchers wrote. “AI-powered security review is available today at a fraction of the cost of a single audit, covering the vast majority of real-world exploit classes.”
And defenders must rise to the occasion…quickly. “We’re in the era of machine-speed exploits. Period. General-purpose AI and traditional ‘check-the-box’ security audits are a false comfort when the actual battle is moving in milliseconds,” says Ram Varadarajan, CEO at Acalvio.
Read more on Security Boulevard

