Pattern library
27 attack patterns across 8 categories, version 2026.06.04. This is the curated core that ships with every plan and is updated regularly; it is not a static keyword list.
Instruction override
- io.ignore-previous (high): Attempt to ignore or override prior/system instructions
- io.forget-everything (high): Reset/forget-everything override
- io.new-instructions (medium): Claims a new/updated set of instructions supersedes the system prompt
- io.disregard-policy (high): Disregard safety policy / guidelines
Role / jailbreak personas
- rj.dan (critical): DAN / 'do anything now' jailbreak persona
- rj.developer-mode (high): 'Developer mode' jailbreak
- rj.named-jailbreaks (high): Known jailbreak persona names (STAN, DUDE, BetterDAN, etc.)
- rj.act-as-unrestricted (high): Roleplay as an unrestricted / unfiltered AI
- rj.no-restrictions-ai (high): Asserts the AI now has no restrictions/filters
- rj.jailbreak-word (medium): Explicit jailbreak request
System-prompt exfiltration
- spe.reveal-system-prompt (high): Request to reveal / repeat the system prompt or hidden instructions
- spe.verbatim (medium): Asks to output instructions verbatim / word-for-word
- spe.everything-above (medium): Asks to print everything above the current message
Delimiter injection
- di.fake-system-tag (high): Injected system/role delimiter tokens
- di.begin-system-prompt (high): Fabricated 'BEGIN/END SYSTEM PROMPT' framing
- di.role-impersonation (medium): Impersonates a system/admin/developer speaker
Data exfiltration
- de.send-to-url (critical): Instructs the model/agent to send data to an external URL
- de.curl-fetch (high): Embedded curl/fetch/webhook exfiltration call
- de.markdown-image-exfil (high): Markdown image used to smuggle data into a URL (zero-click exfil)
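The Markdown image case is "zero-click" because a client that renders the image requests its URL automatically, so any data placed in the URL leaves without user action. A minimal sketch of one possible heuristic follows, assuming that an image URL carrying a query string is worth flagging. The regex and the function name `find_image_exfil_candidates` are invented for this illustration and are not the library's actual rule.

```python
import re

# Hypothetical heuristic: a Markdown image whose http(s) URL has a query string.
# The real de.markdown-image-exfil rule is not published here and may differ.
MD_IMAGE_WITH_QUERY = re.compile(
    r"!\[[^\]]*\]\(\s*(https?://[^)\s]+\?[^)\s]+)\s*\)"
)


def find_image_exfil_candidates(text: str) -> list[str]:
    """Return the URLs of Markdown images that carry query parameters."""
    return MD_IMAGE_WITH_QUERY.findall(text)
```

A heuristic this broad would also flag legitimate images with cache-busting parameters, so in practice it would likely need an allow-list of trusted hosts.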
Encoding / evasion
- ee.zero-width (medium): Zero-width / invisible characters used to hide instructions
- ee.base64-instruction (medium): Asks the model to decode and execute base64/hex/rot13 content
- ee.long-base64-blob (low): Large base64 blob (possible hidden payload)
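Two of the checks in this category are structural rather than phrase-based, which makes them easy to sketch. The example below is an assumption-laden illustration: the set of invisible code points, the 200-character threshold, and both function names are chosen for this sketch and do not reflect the library's real rules.

```python
import re

# Assumed set of invisible code points (zero-width space, non-joiner, joiner,
# word joiner, BOM). The actual ee.zero-width rule may cover a different set.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

# Assumed threshold: 200 or more consecutive base64-alphabet characters,
# followed by optional '=' padding.
LONG_BASE64 = re.compile(r"[A-Za-z0-9+/]{200,}={0,2}")


def has_zero_width(text: str) -> bool:
    """True if the text contains any of the assumed invisible characters."""
    return ZERO_WIDTH.search(text) is not None


def has_long_base64_blob(text: str) -> bool:
    """True if the text contains a long run that looks like base64."""
    return LONG_BASE64.search(text) is not None
```

For instance, `has_zero_width("ig\u200bnore")` is true even though the string displays as "ignore", which is why a plain keyword match alone would miss it.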
Refusal suppression
- rs.do-not-refuse (high): Pressures the model not to refuse / warn
- rs.must-comply (medium): Asserts the model must comply regardless of policy
- rs.hypothetical-bypass (medium): Hypothetical/fiction framing used to bypass safety
Tool / agent hijack
- th.invoke-destructive-tool (high): Directs an agent to invoke destructive tools/commands
- th.override-tool-allowlist (medium): Attempts to expand or ignore the agent's tool allow-list
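To make the structure of the catalog concrete, here is a minimal sketch of how entries like these could be represented and applied, assuming each pattern pairs an ID and severity with a regular expression. The IDs and severities come from the list above; the regexes, the `Pattern` class, and the `scan` and `max_severity` helpers are hypothetical and are not the library's implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Severity levels used in the catalog, ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass(frozen=True)
class Pattern:
    id: str            # pattern ID as listed in the catalog
    severity: str      # one of SEVERITY_ORDER
    regex: re.Pattern  # hypothetical expression, NOT the library's real rule


# Three illustrative entries; the regexes are invented for this sketch.
PATTERNS = [
    Pattern("io.ignore-previous", "high", re.compile(
        r"\bignore\s+(?:all\s+)?(?:previous|prior)\s+instructions\b", re.I)),
    Pattern("rj.developer-mode", "high", re.compile(
        r"\bdeveloper\s+mode\b", re.I)),
    Pattern("spe.verbatim", "medium", re.compile(
        r"\b(?:verbatim|word[- ]for[- ]word)\b", re.I)),
]


def scan(text: str) -> list[Pattern]:
    """Return every pattern whose regex matches the text, in catalog order."""
    return [p for p in PATTERNS if p.regex.search(text)]


def max_severity(hits: list[Pattern]) -> Optional[str]:
    """Return the highest severity among the hits, or None if there are none."""
    if not hits:
        return None
    return max((p.severity for p in hits), key=SEVERITY_ORDER.index)
```

Reporting the highest severity among all matches, rather than the first match, is one reasonable design choice when a single input triggers several patterns at once.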