November 13, 2025

Inside the Lab: Berndt on the Sherlock AI V2 Upgrade

The upgrade from v1 to v2 has been a monumental task for the Sherlock AI team: months of integrating customer feedback, refining the model on production audits, and stress-testing against edge cases that exposed v1’s limits. What emerged is a smarter, more adaptable system that reasons like an auditor, scales like software, and continues to learn from every run and codebase engagement.

We sat back down with Bernhard Mueller, Sherlock’s AI Research Lead, to break down what changed under the hood and what v2 means for the future of AI-driven security.

Berndt, what stands out as the biggest improvement in Sherlock AI v2 compared to v1?

Berndt: V2 uses deeper, agentic multi-step reasoning that mirrors how the top researchers on our team actually work through an audit. The system now models the target codebase more effectively, explores the vulnerability space more efficiently, and confirms or rejects hypotheses with more scrutiny. TL;DR: It finds more high-impact bugs and produces fewer false positives.

What new capabilities are you seeing that make the model more useful in real audits or research?

Berndt: V2 can now handle multiple smart-contract languages rather than only Solidity. It also lets us cover a wider array of vulnerability classes and integrate traditional testing techniques far more easily, including practices borrowed from formal verification.

What does Sherlock AI v2 lay the groundwork for as Sherlock AI rolls into the future?

Berndt: Because V2 runs on a LangGraph-native pipeline, we can orchestrate highly flexible agentic flows where specialized agents cooperate intelligently. This makes it trivial to plug in experimental modules, benchmark them against established ones, and keep the architecture improving continuously. We can integrate the next AI breakthrough the moment it appears.

What were the hardest challenges in upgrading from v1 to v2?

Berndt: The hardest part for me is keeping up with the rapid pace of the AI field itself. Major unlocks now arrive several times a year, so we have to reframe AI-driven security analysis from first principles again and again. AI audits today look nothing like they did two years ago, and there’s no historical playbook to copy from, so we keep inventing, revising, and re-inventing the process in real time.

How did your background and expertise specifically influence the model’s development?

Berndt: Coming from hands-on security auditing, I always try to simulate the way a top auditor navigates a codebase: how they map business logic, how their focus shifts as the investigation deepens, how their mental models evolve when triaging potential bugs, and a lot more. There’s a massive design space beyond simply “show the code to a model,” and our architecture was engineered to reflect the real human workflows that consistently uncover critical vulnerabilities.

How will v2 change how protocols approach audits and security preparation?

Berndt: Regarding the upgrade from v1 to v2, the UI is the same and teams can keep their current workflow. Under the hood, though, v2’s backend intelligence will keep improving. That means protocols get sharper security analysis with the same lightweight experience, so confidence grows while the operational burden stays low.

What do you see as the biggest opportunities to expand Sherlock AI moving forward?

Berndt: Next on our roadmap is first-class multi-language support. Clients are also asking for adjacent capabilities such as build-script audits and optimization guidance, and we’re exploring deeper integration with dynamic techniques like fuzzing and symbolic execution.

You’ve now trained a model that can reason about code vulnerabilities better than most auditors could a year ago. What’s the last thing that will always require a human, and how long until we’re wrong about that too?

Berndt: Even with v2 approaching human-level performance, humans still triage findings and balance technical severity against business impact. The gap is closing quickly, but we expect humans to stay in the pilot’s seat for the foreseeable future.

Any final thoughts on what Sherlock AI v2 represents for the broader security field?

Berndt: Sherlock AI v2 marks a shift from AI-assisted auditing to true AI-integrated security. It’s not about replacing human auditors, but about scaling their expertise — taking what the best researchers know and embedding it into a system that learns, adapts, and improves continuously. Every audit, every contest, every fix feeds back into the model, and that feedback loop is what makes it powerful.

Special thanks to Bernhard Mueller for taking the time to walk us through the work behind Sherlock AI v2 and the vision driving what comes next.

Sherlock AI v2 marks another step toward what we’ve been building from the start: a security system that learns, adapts, and improves with every protocol it protects. The release of v2 moves us closer to a future where security is active within development instead of lagging behind it.

Teams interested in seeing how Sherlock AI v2 fits into their workflow can schedule a walkthrough with our team.