FailWatch 🛡️
The Missing Safety Layer for AI Agents
FailWatch prevents your AI agents from executing dangerous tool calls (e.g., unauthorized refunds) caused by failures such as hallucinations or logic drift, intercepting each call before it runs.
Unlike standard evaluation tools that check output after the fact, FailWatch acts as a synchronous Circuit Breaker in your production pipeline.
🎯 Why FailWatch?
When AI agents have access to production tools (databases, payment APIs, email), a single hallucination can cause real damage:
- E-commerce: Agent refunds $10,000 instead of $100
- Banking: Transfers money to wrong account due to context drift
- Operations: Deletes production database thinking it's a test environment
FailWatch sits between your agent and dangerous actions, enforcing safety policies in real time.
⚡ Key Features
🔒 Deterministic Policy Checks
Hard blocks on numeric limits, regex patterns, and business rules. No LLM guessing involved.
policy = {
    "limit": 1000,
    "allowed_accounts": ["checking", "savings"],
    "forbidden_keywords": ["delete_all", "drop_table"]
}
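Rules like these can be enforced with plain Python, no model in the loop. A minimal sketch of such a check (the function name and return shape are illustrative assumptions, not FailWatch's internal API):

```python
import re

def check_policy(policy: dict, tool_args: dict) -> tuple[bool, str]:
    """Return (allowed, reason) using only deterministic rules."""
    # Numeric limit: hard block above the configured amount.
    amount = tool_args.get("amount", 0)
    if amount > policy.get("limit", float("inf")):
        return False, f"amount {amount} exceeds limit {policy['limit']}"

    # Allow-list: the target account must be explicitly permitted.
    account = tool_args.get("account", "")
    allowed = policy.get("allowed_accounts")
    if allowed is not None and account not in allowed:
        return False, f"account {account!r} not in allowed list"

    # Keyword scan: reject if any forbidden token appears in the arguments.
    text = str(tool_args)
    for kw in policy.get("forbidden_keywords", []):
        if re.search(re.escape(kw), text):
            return False, f"forbidden keyword {kw!r} found"

    return True, "ok"

policy = {"limit": 1000, "allowed_accounts": ["checking", "savings"],
          "forbidden_keywords": ["delete_all", "drop_table"]}
print(check_policy(policy, {"amount": 2000, "account": "checking"}))
# -> (False, 'amount 2000 exceeds limit 1000')
```

Because the checks are pure functions of the policy and the arguments, they are fast, testable, and auditable.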
🛡️ Fail-Closed Architecture
Financial-grade safety. If the guard server is down or times out, the action is blocked by default. Money stays put.
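In client terms, fail-closed means any transport error counts as a denial. A sketch of that fallback logic (the endpoint path and response shape here are assumptions, not FailWatch's real API):

```python
import json
import urllib.request

def guarded_call(check_url: str, payload: dict, timeout: float = 1.0) -> bool:
    """Ask the guard server for a verdict; block on any failure.

    Fail-closed: a down server, a timeout, or a malformed response all
    return False, so the sensitive action never runs by accident.
    """
    try:
        req = urllib.request.Request(
            check_url,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp).get("decision") == "allow"
    except Exception:
        # Network outage, timeout, HTTP error, bad JSON: deny by default.
        return False
```

The key design choice is that only an explicit, well-formed "allow" unlocks the action; every other path is a block.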
👥 Human-in-the-Loop
Seamlessly escalate "gray area" actions to Slack, email, or CLI for human approval before execution.
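The escalation step reduces to "pause, ask, resume only on explicit approval". A minimal CLI sketch (Slack and email routes follow the same pattern; the injectable `ask` parameter is an illustrative choice so the flow can be scripted or tested):

```python
def request_approval(summary: str, ask=input) -> bool:
    """Escalate a gray-area action to a human.

    Anything other than an explicit 'yes' denies the action, which keeps
    the approval path fail-closed too.
    """
    answer = ask(f"[FailWatch] Approve this action? {summary} [yes/no] ")
    return answer.strip().lower() == "yes"

# Example: auto-deny in a non-interactive environment.
request_approval("$5,000 transfer", ask=lambda prompt: "no")   # False
```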
📋 Audit Ready
Every decision generates a trace_id and decision_id for compliance logging and post-incident analysis.
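A decision record might look like the following (field names are illustrative assumptions, not FailWatch's actual log schema):

```python
import json
import time
import uuid

def audit_record(tool: str, args: dict, decision: str) -> dict:
    """Build one compliance log entry for a guard decision."""
    return {
        "trace_id": str(uuid.uuid4()),      # groups all checks in one request
        "decision_id": str(uuid.uuid4()),   # identifies this specific verdict
        "timestamp": time.time(),
        "tool": tool,
        "args": args,
        "decision": decision,               # "allow" | "review" | "block"
    }

print(json.dumps(audit_record("refund_user", {"amount": 2000}, "block"), indent=2))
```

Pairing a per-request trace_id with a per-verdict decision_id lets post-incident analysis reconstruct exactly which check fired and why.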
⚡ Sub-50ms Latency
Deterministic checks run in microseconds. LLM checks (when needed) complete in <2s with caching.
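Caching is what keeps repeated LLM judgments cheap; the idea can be sketched with functools.lru_cache (the judge below is a stub standing in for a real model call):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def llm_judge(user_request: str, proposed_action: str) -> str:
    """Stubbed judge keyed on (request, action).

    Identical repeats are served from the cache instead of paying
    another model round-trip.
    """
    # ...an expensive LLM call would go here...
    return "review" if "override" in proposed_action else "allow"

llm_judge("pay invoice #42", "transfer_100")  # first call: model round-trip
llm_judge("pay invoice #42", "transfer_100")  # repeat: served from cache
```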
🚀 Quick Start
1️⃣ Installation
Install the SDK via pip:
pip install failwatch
To run the server locally (required), clone the repo:
git clone https://github.com/Ludwig1827/FailWatch.git
cd FailWatch/server
pip install -r ../requirements.txt
2️⃣ Start the Guard Server
The stateless server handles policy evaluation and LLM-based judgment:
cd server
# Set your OpenAI API Key (required for LLM judge)
# Windows (PowerShell):
$env:OPENAI_API_KEY="sk-..."
# Mac/Linux:
export OPENAI_API_KEY="sk-..."
# Start the server
uvicorn main:app --reload
✅ Server running at: http://127.0.0.1:8000
3️⃣ Run the Demo Agent
Open a new terminal in the project root (FailWatch/) and run the banking agent simulation:
python examples/banking_agent.py
4️⃣ See It In Action
The demo runs three scenarios:
❌ Block: Agent tries to transfer $2,000 (policy limit: $1,000) → FailWatch blocks it instantly.
⏸️ Review: Agent tries a $5,000 transfer with an override flag → FailWatch pauses for human approval.
🔌 Fail-Closed: System simulates a network outage → FailWatch prevents execution (safe default).
🛠️ Usage
Basic Integration
Wrap your sensitive functions with the @guard decorator:
from failwatch import FailWatchSDK

# Initialize SDK
fw = FailWatchSDK(
    api_url="http://localhost:8000",
    default_fail_mode="closed"
)

@fw.guard(
    input_arg="user_request",
    output_arg="tool_args",
    policy={
        "limit": 1000,
        "require_approval_above": 500
    }
)
def refund_user(user_request: str, tool_args: dict):
    """This code ONLY runs if FailWatch approves"""
    amount = tool_args['amount']
    account = tool_args['account']
    # Execute the actual refund
    print(f"💸 Refunding ${amount} to {account}")
    return {"status": "success", "amount": amount}
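With limit and require_approval_above set as above, the guard effectively makes a three-way decision per call. A toy model of that routing (an illustration of the concept, not the SDK's actual logic):

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    REVIEW = "review"   # pause for human approval
    BLOCK = "block"

def decide(policy: dict, tool_args: dict) -> Decision:
    """Route by amount: hard-block above the limit, escalate above the
    approval threshold, allow otherwise."""
    amount = tool_args.get("amount", 0)
    if amount > policy.get("limit", float("inf")):
        return Decision.BLOCK
    if amount > policy.get("require_approval_above", float("inf")):
        return Decision.REVIEW
    return Decision.ALLOW

policy = {"limit": 1000, "require_approval_above": 500}
print(decide(policy, {"amount": 700}))   # Decision.REVIEW
```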
Custom Policies
Define complex business logic:
policy = {
    # Hard limits (deterministic)
    "limit": 1000,
    "max_daily_total": 5000,
    # Pattern matching
    "allowed_account_pattern": r"^[A-Z]{2}\d{8}$",
    "forbidden_keywords": ["admin", "root", "sudo"],
    # Contextual rules (evaluated by LLM judge)
    "require_manager_approval_if": {
        "amount_above": 500,
        "account_type": "external",
        "time_after": "18:00"
    }
}

@fw.guard(
    input_arg="user_request",
    output_arg="tool_args",
    policy=policy
)
def process_transaction(user_request: str, tool_args: dict):
    # Your transaction logic here
    pass
🏗️ Architecture
┌─────────────────┐
│   Your Agent    │
│  (LangChain,    │
│   LlamaIndex,   │
│   Custom)       │
└────────┬────────┘
         │
         │ @guard decorator
         ▼
┌─────────────────┐
│  FailWatch SDK  │ ◄── Lightweight client
│    (Python)     │     Handles interception
└────────┬────────┘     & fallback logic
         │
         │ HTTP/gRPC
         ▼
┌─────────────────┐
│  Guard Server   │ ◄── Policy evaluation
│   (FastAPI)     │     + LLM judgment
└────────┬────────┘
         │
         ├──► Deterministic Checks (regex, limits)
         ├──► LLM Judge (logic drift detection)
         └──► Audit Logger (PostgreSQL/S3)
Components
- SDK (sdk/): Lightweight Python client with fail-safe defaults
- Server (server/): FastAPI engine for policy evaluation
- Dashboard: Trace visualization at http://localhost:8000/
- Examples (examples/): Demo agents for banking, e-commerce, ops
💼 Use Cases
Financial Services
- Block unauthorized transactions above policy limits
- Prevent wire transfers to unverified accounts
- Require dual approval for high-value operations
E-commerce
- Stop agents from issuing excessive refunds
- Validate discount codes before applying
- Prevent inventory over-commitment
DevOps
- Block destructive database operations in production
- Require confirmation for infrastructure changes
- Prevent accidental data deletion
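Blocking destructive database operations, for example, maps onto the forbidden-keyword policy shown earlier. A sketch of such a pattern check (heuristic only; a real policy would also verify which environment the statement targets):

```python
import re

# Heuristic patterns for destructive SQL; extend per your dialect and schema.
DESTRUCTIVE = re.compile(
    r"\b(drop\s+table|truncate\s+table|delete\s+from)\b",
    re.IGNORECASE,
)

def is_destructive(sql: str) -> bool:
    """True if the statement matches a known-destructive pattern."""
    return bool(DESTRUCTIVE.search(sql))

is_destructive("DROP TABLE users")        # True: hard block
is_destructive("SELECT id FROM users")    # False: allowed through
```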
Healthcare
- Enforce HIPAA compliance on data access
- Require attestation before PHI disclosure
- Block unauthorized prescription modifications
🧪 Testing
Run the test suite:
# Unit tests
pytest tests/unit/
# Integration tests (requires server)
pytest tests/integration/
# Load tests
pytest tests/load/ -n auto
🗺️ Roadmap
- Core policy engine
- LLM-based logic drift detection
- Human-in-the-loop approvals
- Dashboard UI for trace analysis
- Slack/Teams integration
- Multi-LLM judge support (Claude, Gemini)
- gRPC support for lower latency
- Policy versioning & rollback
- Custom webhook integrations
- SOC2 compliance toolkit
🤝 Contributing
We're looking for design partners running agents in:
- 🏦 Fintech
- ⚖️ Legal
- 🏥 Healthcare
- 🔧 DevOps
Want to help build the standard for AI reliability?
- Fork the repo
- Create a feature branch (git checkout -b feature/amazing-safety-check)
- Commit your changes (git commit -m 'Add amazing safety check')
- Push to the branch (git push origin feature/amazing-safety-check)
- Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
🔍 Troubleshooting
Server won't start
# Check if port 8000 is in use
lsof -i :8000 # Mac/Linux
netstat -ano | findstr :8000 # Windows
# Use a different port
uvicorn main:app --port 8001
OpenAI API errors
# Verify your key is set
echo $OPENAI_API_KEY # Mac/Linux
echo $env:OPENAI_API_KEY # Windows
# Check your OpenAI quota
curl https://api.openai.com/v1/models \
-H "Authorization: Bearer $OPENAI_API_KEY"
Import errors
# If you get "ModuleNotFoundError: No module named 'failwatch'"
# Make sure you installed via pip:
pip install failwatch
# If running examples from source, use:
from sdk import FailWatchSDK
📄 License
MIT License - see LICENSE for details.
🙏 Acknowledgments
Built with:
💬 Support
- 📧 Email: beeth.xue@gmail.com
- 🐛 Issues: GitHub Issues