Rules Active
-
configured rules
Blocked This Month
-
messages intercepted
Profanity Filtered
-
last 30 days
Competitor Mentions
-
redirected
🚧 Topic Guardrails
Define which topics the bot is allowed, redirected, or fully blocked from discussing.
Allowed Topics (Bot will respond)
📦 Product & service queries Allowed
💰 Pricing & billing Allowed
🛠 Technical support Allowed
Redirect Topics (Acknowledge & guide)
⚖️ Legal & compliance Redirect
→ "For legal concerns, please contact our compliance team directly."
Blocked Topics (Refuse & log)
🔞 Adult / explicit content Blocked
🗳 Politics & political commentary Blocked
💊 Drug / substance references Blocked
OUT-OF-SCOPE RESPONSE
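The three tiers above (allowed, redirect, blocked) can be pictured as a simple lookup table. This is only a sketch under stated assumptions: the topic ids, the `decide` function, and the placeholder out-of-scope text are hypothetical, and treating unlisted topics as blocked is an assumption rather than documented behaviour.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REDIRECT = "redirect"
    BLOCK = "block"


# Hypothetical topic ids standing in for the rows listed on this page.
TOPIC_POLICY = {
    "product_service": Action.ALLOW,
    "pricing_billing": Action.ALLOW,
    "technical_support": Action.ALLOW,
    "legal_compliance": Action.REDIRECT,
    "adult_content": Action.BLOCK,
    "politics": Action.BLOCK,
    "drugs": Action.BLOCK,
}

REDIRECT_MESSAGES = {
    "legal_compliance": "For legal concerns, please contact our compliance team directly.",
}

# Placeholder: the page leaves the out-of-scope response text blank.
OUT_OF_SCOPE_RESPONSE = "<out-of-scope response>"


def decide(topic):
    """Return (action, canned_reply, should_log); a reply of None means the bot answers normally."""
    # Assumption: topics missing from the table are handled like blocked ones.
    action = TOPIC_POLICY.get(topic, Action.BLOCK)
    if action is Action.ALLOW:
        return action, None, False
    if action is Action.REDIRECT:
        return action, REDIRECT_MESSAGES.get(topic, OUT_OF_SCOPE_RESPONSE), False
    # Blocked topics are refused and logged.
    return action, OUT_OF_SCOPE_RESPONSE, True
```

Keeping the policy in one table means a new topic only needs a new row, not a new branch.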
🚫 Competitor & Brand Rules
When a competitor brand is mentioned, the bot will not compare or disparage. Instead it redirects to your strengths.
Competitor Keywords to Intercept
Case-insensitive. Partial matches also detected.
Response when competitor mentioned
Brand Terms to Protect
Bot will never suggest alternatives when these are mentioned positively.
Negative Sentiment Intercept
Detect negative sentiment about your brand
If user expresses frustration, trigger empathy response and offer escalation
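Case-insensitive partial matching of competitor keywords, as described above, could look roughly like the following. The function name and the sample keywords are hypothetical; the product's real matcher may behave differently, for example around word boundaries.

```python
def find_competitor_mentions(message, keywords):
    """Return the configured keywords that occur anywhere in the message.

    Matching is case-insensitive and partial: "acme" also matches "AcmeChat".
    """
    text = message.casefold()
    hits = []
    for keyword in keywords:
        needle = keyword.strip().casefold()
        if needle and needle in text:
            hits.append(keyword)
    return hits


# Hypothetical keyword list, for illustration only.
mentions = find_competitor_mentions("Is AcmeChat cheaper than you?", ["acme", "Globex"])
```

Partial matching is broad by design, so very short keywords are likely to produce false positives.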
🤬 Profanity Filter
-
Filtered this month
-
Unique users
-
Escalated
Use built-in profanity list
Pre-loaded dictionary with common profanity
Custom blocked words
Additional words specific to your context
Progressive warnings before block
Warn user on 1st & 2nd violation, block on 3rd
Notify admin on repeated violations
Send email alert if same user triggers filter 3+ times. Configure email address →
WARNING MESSAGE (1st / 2nd violation)
BLOCK MESSAGE (3rd+ violation)
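The escalation ladder described above (warn on the 1st and 2nd violation, block from the 3rd, notify the admin at 3+ triggers) can be sketched as a per-user counter. The class and method names are hypothetical, and keeping counts in memory with no reset is an assumption made for brevity.

```python
from collections import defaultdict


class ViolationTracker:
    """Counts profanity violations per user and maps the count to an action."""

    def __init__(self, block_at=3, notify_at=3):
        self.block_at = block_at      # violation number at which blocking starts
        self.notify_at = notify_at    # violation number at which the admin is emailed
        self.counts = defaultdict(int)

    def record(self, user_id):
        """Register one violation; return (action, notify_admin)."""
        self.counts[user_id] += 1
        n = self.counts[user_id]
        action = "block" if n >= self.block_at else "warn"
        return action, n >= self.notify_at


tracker = ViolationTracker()
outcomes = [tracker.record("user-1") for _ in range(4)]
```

Here `outcomes` holds two warnings followed by two blocks, with the admin flag set from the third violation onward.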
⚠️ Sensitive Topic Protocols
Define specific handling protocols for sensitive situations that require human judgment.
🆘 Self-harm / Crisis signals Critical
Crisis trigger keywords
Comma-separated. Case-insensitive partial match.
Helpline / crisis response message
Sent to the user when crisis signals are detected.
Alert admin email on crisis detection
Always escalates to agent. Email is an additional alert.
🔐 Personal data requests Medium
Personal data trigger keywords
Phrases that indicate a user is requesting someone else's private data.
Decline response message
Sent to the user when a personal data request is detected.
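For the crisis protocol above, the keyword field is comma-separated and matched case-insensitively as a partial match. Detection always escalates to an agent, with the admin email as an optional extra. A minimal sketch, assuming hypothetical function names and a hypothetical result shape:

```python
def parse_keywords(raw):
    """Split a comma-separated field into normalised, non-empty keywords."""
    return [part.strip().casefold() for part in raw.split(",") if part.strip()]


def check_crisis(message, raw_keywords, alert_email_enabled):
    """Return the follow-up steps for one message (hypothetical result shape)."""
    text = message.casefold()
    detected = any(keyword in text for keyword in parse_keywords(raw_keywords))
    return {
        "detected": detected,
        "send_helpline_message": detected,
        # Escalation to a human agent does not depend on the email setting.
        "escalate_to_agent": detected,
        "email_admin": detected and alert_email_enabled,
    }
```

The personal data protocol could reuse `parse_keywords` with its own phrase list and a decline message in place of the escalation.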
🚦 Rate Limiting & Spam
Max messages per user
Throttle contacts sending too many messages rapidly
msgs per sec
Duplicate message detection window
Ignore identical messages within this time window
sec
Soft block duration
Pause bot responses for this user after rate limit hit
min
Soft block message
Sent to the user once when they hit the rate limit
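The throttle, duplicate window, and soft block settings above interact, so a small sketch may help. It assumes a sliding window of accepted messages per user and takes the current time as an argument so the logic is easy to test. The class name, parameter names, and return labels are hypothetical.

```python
from collections import defaultdict, deque


class RateLimiter:
    def __init__(self, max_msgs, per_sec, dup_window_sec, soft_block_min):
        self.max_msgs = max_msgs
        self.per_sec = per_sec
        self.dup_window_sec = dup_window_sec
        self.soft_block_sec = soft_block_min * 60
        self.history = defaultdict(deque)  # user -> timestamps of accepted messages
        self.last_message = {}             # user -> (text, timestamp)
        self.blocked_until = {}            # user -> time the soft block ends

    def check(self, user, text, now):
        """Classify one inbound message; `now` is a timestamp in seconds."""
        until = self.blocked_until.get(user)
        if until is not None:
            if now < until:
                return "soft_block_silent"  # the block message was already sent once
            del self.blocked_until[user]

        previous = self.last_message.get(user)
        self.last_message[user] = (text, now)
        if previous and previous[0] == text and now - previous[1] <= self.dup_window_sec:
            return "ignore_duplicate"

        window = self.history[user]
        while window and now - window[0] >= self.per_sec:
            window.popleft()                # drop timestamps outside the window
        if len(window) >= self.max_msgs:
            self.blocked_until[user] = now + self.soft_block_sec
            window.clear()
            return "soft_block_notify"      # send the soft block message this one time
        window.append(now)
        return "process"
```

In production, `now` would typically come from `time.monotonic()`.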
Block inbound links
Block user messages that contain URLs
Allowed domains
URLs from these domains are permitted even when link blocking is on
BLOCKED URL PATTERNS (one per line, regex)
Additional URL patterns to block regardless of the allowlist. Case-insensitive regex.
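How the allowlist and the blocked patterns combine can be sketched as follows. Several points are assumptions, not documented behaviour: only `http(s)` URLs are detected, subdomains of an allowed domain count as allowed, and blocked patterns apply even when link blocking is off. All names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Simplified URL detector (assumption: only http/https links are considered).
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def is_allowed_host(host, allowed_domains):
    """True if host equals an allowed domain or is one of its subdomains."""
    host = host.lower().rstrip(".")
    for domain in allowed_domains:
        domain = domain.lower().strip()
        if domain and (host == domain or host.endswith("." + domain)):
            return True
    return False


def should_block(message, block_links, allowed_domains, blocked_patterns):
    """Decide whether a message is blocked because of the URLs it contains."""
    for url in URL_RE.findall(message):
        # Blocked patterns win regardless of the allowlist.
        if any(re.search(p, url, re.IGNORECASE) for p in blocked_patterns):
            return True
        if block_links:
            host = urlparse(url).hostname or ""
            if not is_allowed_host(host, allowed_domains):
                return True
    return False
```

Comparing the parsed hostname rather than searching the raw string avoids letting `notexample.com` pass for `example.com`.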
🛡 Prompt Injection Guard
AI-Powered
Detects attempts to manipulate bot behaviour through crafted messages (prompt injection, jailbreaks, role-play attacks).
Built-in patterns cover: role-play / persona overrides, system prompt extraction, instruction overrides, jailbreak tokens, and token-smuggling attempts. All are active when the guard is enabled.
Alert admin on detected injection
Send email notification when an injection attempt is blocked
BLOCK RESPONSE MESSAGE
EXTRA PATTERNS (one per line)
Plain text fragments or regex. Case-insensitive. One per line. Examples: developer mode · act as (gpt|claude) · clear your memory
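Since each line in the extra patterns box may be either plain text or a regex, one plausible way to handle both is to try compiling the line as a regex and fall back to a literal match if that fails. This is an illustrative sketch with hypothetical function names, not the guard's actual implementation.

```python
import re


def compile_patterns(raw):
    """Turn a one-pattern-per-line text field into case-insensitive regexes."""
    compiled = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            compiled.append(re.compile(line, re.IGNORECASE))
        except re.error:
            # Not a valid regex: treat the line as a literal text fragment.
            compiled.append(re.compile(re.escape(line), re.IGNORECASE))
    return compiled


def is_injection(message, patterns):
    """True if any pattern matches anywhere in the message."""
    return any(p.search(message) for p in patterns)


patterns = compile_patterns("developer mode\nact as (gpt|claude)\nclear your memory")
```

A plain fragment such as `developer mode` is already a valid regex that matches itself, so the fallback only matters for lines with unbalanced special characters.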
🤖 LLM Second-Pass Classifier
qwen2.5:7b
Runs after all regex and keyword checks pass. Catches paraphrased or creative violations that bypass exact-match rules, e.g. indirect crisis signals, obfuscated jailbreaks, novel profanity. Adds ~200–500 ms of latency per message, and only when all prior checks allow the message.
Confidence threshold
Only flag messages the model is at least this confident about. Higher = fewer false positives.
/ 1.0
Action when flagged
What to do when the LLM flags a message and the specific category has no defined action.
Categories to detect
Custom instructions (optional)
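The threshold and fallback action settings above suggest a decision step like the one sketched here. It assumes the classifier returns a category and a confidence between 0 and 1. The result shape, function name, and action labels are hypothetical, and no model call is shown.

```python
def resolve_llm_verdict(result, threshold, enabled_categories, category_actions, default_action):
    """Map one classifier result to an action.

    result: {"category": str, "confidence": float}, a hypothetical shape.
    category_actions: per-category actions defined elsewhere on the page.
    default_action: the "Action when flagged" fallback.
    """
    category = result["category"]
    if category not in enabled_categories:
        return "allow"  # category is not selected under "Categories to detect"
    if result["confidence"] < threshold:
        return "allow"  # below the confidence threshold, so not flagged
    return category_actions.get(category, default_action)


verdict = resolve_llm_verdict(
    {"category": "jailbreak", "confidence": 0.91},
    threshold=0.8,
    enabled_categories={"jailbreak", "crisis"},
    category_actions={"crisis": "escalate"},
    default_action="block",
)
```

Raising the threshold trades missed violations for fewer false positives, as the help text for that setting notes.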
🧪 Guardrail Tester
Test a message against all active rules without sending to users
RECENT AUDIT LOG
Time | Rule Triggered | Action
No recent events