Lakera / Check Point Software

Alexandra Hose,

"Crash test" for LLMs in AI agents

Lakera and the UK AI Security Institute have launched 'b3 ', a new open source benchmark. b3 is an open source security evaluation project specifically designed to protect Large Language Models (LLM) in AI agents.

Lakera co-founder Mateo Rojas-Carulla © Lakera

The benchmark b3 was built on the basis of the new idea called Threat Snapshots. Instead of simulating a complete AI agent from start to finish, the threat snapshots zoom in on the critical points where vulnerabilities in LLM frequently occur.

By testing the models at these specific points, developers can see how robust their systems are against attacks - without the complexity that was previously required to model a complete agent workflow. A kind of 'crash test' for AI agents.

LLMs with inference enabled have lower vulnerability scores - lower is better - and are therefore less vulnerable © Lakera, a Check Point Company

"We developed the b3 benchmark because today's AI agents are only as secure as the LLMs that fuel them," explains Lakera co-founder Mateo Rojas-Carulla. "These threat snapshots allow us to systematically look for vulnerabilities on the attack surface that were previously hidden in the complex agent workflows."

b3 combines ten representative threat snapshots with 19,433 real cyberattacks from the gamified red-teaming game 'Gandalf: Agent Breaker'. Among other things, prompt exfiltration, phishing link injection, malicious code injection, DoS and unauthorized tool calls are evaluated.

Advertisement

The first tests with 31 common LLM models show:

  • Better reasoning capabilities increase security
  • Model size does not correlate with security performance
  • Closed source performs better on average, but top open models catch up

The benchmark report is available under an open source license: https://arxiv.org/pdf/2510.22620

Gandalf: Agent Breaker is a hacking simulator game in which you are challenged to crack and exploit AI agents in realistic scenarios. The ten GenAI applications in the game simulate the behavior of a real AI agent. Each app features multiple difficulty levels, layered defenses and novel attack surfaces that challenge a range of skills, from prompt engineering to red teaming. Some of the apps are chat-based, while others rely on code-level thinking, file processing, memory or the use of external tools.

  • Xing Icon
  • LinkedIn Icon
Advertisement
Advertisement

You might also be interested in

Advertisement
Advertisement
Advertisement
Advertisement
Advertisement
Advertisement
Advertisement

Robotics

Michael Ardelt becomes first COO at Robco

Robco, a company for autonomous industrial robotics, is further expanding its management team. As the company's first Chief Operating Officer, Michael Ardelt will assume responsibility for structuring and scaling Robco's growth in operational terms.

read more...
Subscribe to our newsletter
Advertisement
Back to home