Built-in Guardrails
PII Detection
Guardrails automatically detect and handle various types of personally identifiable information (PII) in both user inputs and agent outputs:

- `email` - Email addresses
- `credit_card` - Credit card numbers (validated with the Luhn algorithm)
- `ssn` - Social Security numbers
- `phone` - Phone numbers
- `ip_address` - IP addresses (validated with the standard library)
- `url` - URLs
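As a rough illustration of how detection with validation can work, here is a minimal sketch (the pattern names and functions are illustrative, not the library's internals) combining a regex scan with a Luhn checksum to cut down false positives on card numbers:

```python
import re

# Illustrative patterns only; real detectors are typically stricter.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    digits = [int(d) for d in re.sub(r"\D", "", number)]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def find_pii(text: str) -> dict:
    """Return regex matches, keeping only Luhn-valid card candidates."""
    return {
        "email": EMAIL_RE.findall(text),
        "credit_card": [m for m in CARD_RE.findall(text) if luhn_valid(m)],
    }
```

The checksum step is what separates a real card number from an arbitrary run of 16 digits.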
Content Moderation
Guardrails can detect and block toxic or harmful content using OpenAI's moderation API:

- `block_toxic` - If `True`, blocks toxic, harmful, or inappropriate content before it reaches the model (requires the `OPENAI_API_KEY` environment variable)
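A hedged sketch of what such a check might look like: the error type and wiring below are hypothetical, and only `client.moderations.create(...)` is the real OpenAI SDK call.

```python
class ToxicContentError(Exception):
    """Hypothetical error type raised when content is blocked."""

def flagged_categories(categories: dict) -> list:
    """Pure helper: names of moderation categories that were flagged."""
    return [name for name, hit in categories.items() if hit]

def check_toxicity(text: str) -> None:
    # Imported here so the pure helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.moderations.create(input=text).results[0]
    if result.flagged:
        hits = flagged_categories(result.categories.model_dump())
        raise ToxicContentError(f"Content blocked; flagged for: {hits}")
```

Raising before the model is ever called is what makes this an input guardrail rather than output filtering.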
PII Handling Strategies
Each PII type can use one of these strategies to handle detected information:

| Strategy | Description | Example |
|---|---|---|
| `"block"` | Raise an error when PII is detected | Error thrown, execution stops |
| `"redact"` | Replace with `[REDACTED_<TYPE>]` | `[REDACTED_EMAIL]` |
| `"mask"` | Partially obscure content | `****-****-****-1234` for credit cards, `u***@example.com` for emails |
| `"hash"` | Replace with a deterministic SHA-256 hash | `<email_hash:a8f5f167...>` |
| `None` | Ignore this PII type | No action taken |
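The strategies in the table above can be sketched in a few lines; the function name and exact mask/hash formats here are illustrative, not the library's internals:

```python
import hashlib

def apply_strategy(strategy, pii_type: str, match: str) -> str:
    """Apply one handling strategy to a single detected PII match."""
    if strategy == "block":
        raise ValueError(f"{pii_type} detected: execution stopped")
    if strategy == "redact":
        return f"[REDACTED_{pii_type.upper()}]"
    if strategy == "mask":
        # Keep only the last 4 characters visible (simplified masking).
        return "*" * (len(match) - 4) + match[-4:]
    if strategy == "hash":
        digest = hashlib.sha256(match.encode()).hexdigest()
        return f"<{pii_type}_hash:{digest[:8]}...>"
    return match  # strategy is None: leave the match untouched
```

Note that hashing is deterministic: the same input always yields the same digest, which lets downstream systems correlate records without seeing the raw value.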
Basic Usage
Create a guardrail and add it to your agent. Guardrails automatically process both input and output.
Content Moderation
Use OpenAI's moderation API to detect and block toxic or harmful content:

- `block_toxic` - If `True`, block toxic or harmful content using OpenAI's moderation API (requires the `OPENAI_API_KEY` environment variable)
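Since the library's actual class and method names are not shown in this excerpt, the flow can be illustrated with a self-contained toy sketch (all names hypothetical): the guardrail processes the user input before the model sees it, and the output before it is returned.

```python
import re

class PIIGuardrail:
    """Toy guardrail: redacts emails; stands in for the real class."""
    def __init__(self, email: str = "redact"):
        self.email = email
        self._email_re = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def process(self, text: str) -> str:
        if self.email == "redact":
            return self._email_re.sub("[REDACTED_EMAIL]", text)
        return text

class Agent:
    """Toy agent that echoes its input; stands in for a real model call."""
    def __init__(self, guardrail: PIIGuardrail):
        self.guardrail = guardrail

    def run(self, user_input: str) -> str:
        safe_input = self.guardrail.process(user_input)   # input side
        output = f"You said: {safe_input}"                # model call stand-in
        return self.guardrail.process(output)             # output side

agent = Agent(PIIGuardrail(email="redact"))
print(agent.run("My email is u@example.com"))
# → You said: My email is [REDACTED_EMAIL]
```

The key point is the double pass: the same guardrail runs once on the way in and once on the way out.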
Custom Patterns
Add custom detection patterns for domain-specific sensitive information:

- Compiled regex - uses the default "redact" strategy
- Tuple `(detector_func, strategy)` - a custom detector function that returns a list of matches, with the specified strategy
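The two custom-pattern forms above can be sketched with a small dispatcher; the dispatcher and the example patterns (ticket IDs, project codenames) are hypothetical, not the library's implementation:

```python
import re

def apply_custom(pattern, text: str, label: str) -> str:
    """Dispatch on the two custom-pattern forms described above."""
    if isinstance(pattern, re.Pattern):
        # Compiled regex: default "redact" strategy.
        return pattern.sub(f"[REDACTED_{label.upper()}]", text)
    detector, strategy = pattern  # tuple form: (detector_func, strategy)
    result = text
    for match in detector(text):
        if strategy == "redact":
            result = result.replace(match, f"[REDACTED_{label.upper()}]")
        elif strategy == "mask":
            result = result.replace(match, "*" * len(match))
    return result

# Usage: a compiled regex for hypothetical internal ticket IDs...
ticket_re = re.compile(r"TKT-\d{6}")
# ...and a tuple pairing a custom detector function with a strategy.
def find_codenames(text):
    return [w for w in text.split() if w.startswith("PROJ_")]

print(apply_custom(ticket_re, "see TKT-123456", "ticket"))
print(apply_custom((find_codenames, "mask"), "about PROJ_X", "codename"))
```

The tuple form is the escape hatch: anything a function can find in a string (checksummed IDs, dictionary lookups) can feed the same strategy machinery as a regex.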