Guardrails
Guardrails are safety policies applied at the gateway level to protect against data leakage, injection attacks, and inappropriate content.
Available guardrails
DLP (Data Loss Prevention)
Detects and blocks sensitive data in prompts before they reach the AI.
Detects:
- API keys and secrets (AWS, GCP, Stripe, etc.)
- Database connection strings
- Private keys and certificates
- Custom patterns (regex-based)
Action: blocks the request and shows which patterns matched.
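A minimal sketch of the regex-based matching the DLP guardrail describes. The pattern names and expressions here are illustrative assumptions, not the gateway's actual rule set; a production DLP layer ships far broader, tuned patterns.

```python
import re

# Illustrative patterns only; real DLP rule sets are broader and tuned.
DLP_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24,}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "postgres_conn_string": re.compile(r"postgres(?:ql)?://\S+:\S+@\S+"),
}

def scan_prompt(prompt: str) -> list[str]:
    """Return the names of every DLP pattern that matches the prompt."""
    return [name for name, pattern in DLP_PATTERNS.items() if pattern.search(prompt)]
```

A non-empty result means the request is blocked and the matched pattern names are surfaced to the developer, as described above.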
PII Detection
Identifies personally identifiable information in prompts.
Detects:
- Email addresses
- Phone numbers
- Social Security numbers
- Credit card numbers
- Custom PII patterns
Action: configurable — block, warn, or redact.
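The "redact" action can be sketched as a typed substitution pass. These patterns are simplified assumptions (for example, real credit card detection usually adds a Luhn checksum to cut false positives):

```python
import re

# Simplified illustration; production PII detectors validate matches
# (e.g. Luhn check for card numbers) rather than trusting raw regexes.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def redact_pii(prompt: str) -> str:
    """Replace each PII match with a typed placeholder (the 'redact' action)."""
    for name, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED:{name}]", prompt)
    return prompt
```

The "block" and "warn" actions would use the same matching step but reject the request or attach a warning instead of rewriting it.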
Prompt Injection Guard
Detects attempts to override system prompts or inject malicious instructions.
Detects:
- Common injection patterns (“ignore previous instructions”)
- Role-playing attacks (“you are now…”)
- Encoding-based evasion attempts
Action: blocks the request.
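A hedged sketch of the three detection classes above. The patterns are illustrative, and the base64 pass stands in for just one of the encoding-based evasions a real guard would normalize away:

```python
import base64
import re

# Assumed example patterns for the injection classes named above.
INJECTION_PATTERNS = [
    re.compile(r"ignore (?:all |the )?previous instructions", re.IGNORECASE),
    re.compile(r"\byou are now\b", re.IGNORECASE),
    re.compile(r"disregard (?:your|the) system prompt", re.IGNORECASE),
]

def is_injection(prompt: str) -> bool:
    """True if the prompt, raw or base64-decoded, matches an injection pattern."""
    candidates = [prompt]
    try:
        # One simple evasion check: a prompt that is itself valid base64
        # may decode to an injection string.
        candidates.append(base64.b64decode(prompt, validate=True).decode("utf-8", "ignore"))
    except Exception:
        pass
    return any(p.search(c) for p in INJECTION_PATTERNS for c in candidates)
```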
Content Filter
Filters requests and responses for inappropriate or off-topic content.
Detects:
- Content outside the coding domain
- Harmful or offensive requests
- Policy-violating responses
Action: configurable — block or flag for review.
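The block-or-flag decision can be illustrated with a toy topical check. This keyword list is purely an assumption for the sketch; content filters in practice typically rely on a classifier model rather than keyword matching:

```python
# Toy stand-in for a topical classifier; real filters use a model, not keywords.
CODING_HINTS = {"function", "class", "bug", "compile", "refactor", "test", "api"}

def filter_request(prompt: str, mode: str = "flag") -> str:
    """Return 'allow' for on-topic prompts, else 'block' or 'flag' per config."""
    on_topic = any(word in prompt.lower() for word in CODING_HINTS)
    if on_topic:
        return "allow"
    return "block" if mode == "block" else "flag"
```

The `mode` parameter mirrors the configurable action above: "block" rejects the request outright, while "flag" lets it through but marks it for review.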
Configuration
Configure guardrails from the dashboard:
- Navigate to Guardrails
- Enable/disable individual guardrails
- Set action for each (block, warn, or log-only)
- Add custom patterns for DLP or PII
- Save
Audit log
All guardrail triggers are logged:
- Timestamp and developer
- Which guardrail fired
- The pattern or rule that matched
- Action taken (blocked, warned, logged)
- Request summary (truncated for privacy)
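The fields above suggest a record shape like the following. This is a hypothetical schema for illustration, not the gateway's actual storage format; the 80-character truncation limit is likewise an assumed value:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class GuardrailEvent:
    """Hypothetical audit-log record mirroring the fields listed above."""
    developer: str
    guardrail: str        # e.g. "dlp", "pii", "prompt_injection", "content_filter"
    matched_rule: str     # the pattern or rule that fired
    action: str           # "blocked", "warned", or "logged"
    request_summary: str  # truncated for privacy (limit assumed here)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        # Truncate so full prompts never land in the audit log.
        self.request_summary = self.request_summary[:80]
```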