jailbreak

A direct prompting attack intended to circumvent restrictions placed on a model's outputs, for example by bypassing its refusal behaviour to enable misuse.

📚 Reference: NIST AI 100-2e2025
🏷️ Category: Cybersecurity
📊 Commonality: common