Deception and Scheming in AI Agents: Mikita Balensi on How Your AI Will Lie to Achieve Its Goals

Mar 23

Mikita Balensi talks about how LLMs can deceive their users when put under pressure and how frontier models are capable of in-context scheming. Most importantly, he explains what we can do to prevent a catastrophe caused by misaligned LLM agents.