Beyond Roleplay: Jailbreaking Gemini with drugs and ritual
LLMs are trained to treat tool calls as actions they take on the outside world, and tool responses as true facts returned to them from that world. If you give an LLM a launch_nukes tool interface, it'll behave a bit more like the kind of LLM that can actually launch the missiles. Because models take tools to be real capabilities with real effects on the world, tool interfaces let us do things that mere prompting can't.