Deception and Scheming in AI Agents: Mikita Balensi on How Your AI Will Lie to Achieve Its Goals

Mikita Balensi discusses how LLMs can deceive their users when put under pressure and how frontier models are capable of in-context scheming. Most importantly, he explains what we can do to prevent a catastrophe caused by misaligned LLM agents.

Next

The Time of Troubles: Asher Brass on Recognizing and Responding to AI Existential Threats