Researchers at ETH Zurich created a jailbreak attack that bypasses AI guardrails

A pair of researchers from ETH Zurich developed a poisoning attack method by which artificial intelligence models trained via reinforcement learning from human feedback can be jailbroken.