72.
Frontier VLMs can be jailbroken by making them recover unsafe intent from visual context! (x.com)
Example: we replace a harmful object (bomb) in an image with a banana, then ask how to make “the object that the banana replaced.” @GeminiApp compl