Jailbreakers Use Invisible Characters to Beat AI Guardrails
New Research Uncovers Tokenizer Blind Spots in Leading LLMs
Subtle obfuscation techniques can systematically evade the guardrails that today's large language models rely on. Researchers at Mindgard found that adversaries can "smuggle" malicious payloads past tokenizers using emojis, zero-width spaces, and homoglyphs.
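To see why these tricks work, consider a guardrail that scans prompts for known attack phrases. The sketch below is a minimal illustration, not Mindgard's actual tooling: the blocklist, filter, and payload are hypothetical. It shows how a zero-width space between characters, or a visually identical Cyrillic letter swapped in for a Latin one, breaks the byte-level match while the text still reads the same to a human.

```python
# Minimal sketch (hypothetical filter, not Mindgard's method): how invisible
# characters and homoglyphs can slip a flagged phrase past naive matching.

BLOCKLIST = ["ignore previous instructions"]  # hypothetical guardrail rule

def naive_filter(prompt: str) -> bool:
    """Return True if the prompt trips the keyword blocklist."""
    return any(term in prompt.lower() for term in BLOCKLIST)

ZWSP = "\u200b"  # zero-width space: invisible when rendered

def smuggle_zero_width(text: str) -> str:
    # Insert a zero-width space between every character, so no
    # blocklisted substring survives intact at the byte level.
    return ZWSP.join(text)

def smuggle_homoglyphs(text: str) -> str:
    # Swap a few Latin letters for visually identical Cyrillic ones.
    homoglyphs = {"e": "\u0435", "o": "\u043e", "i": "\u0456", "a": "\u0430"}
    return "".join(homoglyphs.get(c, c) for c in text)

payload = "ignore previous instructions"
print(naive_filter(payload))                      # True  - caught
print(naive_filter(smuggle_zero_width(payload)))  # False - slips past
print(naive_filter(smuggle_homoglyphs(payload)))  # False - slips past
```

The same mismatch affects tokenizers: the obfuscated string produces a different token sequence than the clean phrase, so safety layers keyed to the original wording may never fire, even though the model can often still reconstruct the intended instruction.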