# Sentinel

*MCP guardrailing for LLM agents using logic programming*
Sentinel sits between an LLM agent and its tools/MCP, enforcing business rules through a logic solver. Instead of hoping the LLM interprets policy documents correctly, Sentinel uses a solver to generate proofs that tool actions are permitted according to a policy. If they are not, the LLM agent is guided towards a solution.
## The Problem
LLMs can't be deployed in regulated, rule-based environments without better reasoning skills. When Air Canada's chatbot invented a refund policy and the airline was held legally liable, it demonstrated a fundamental gap: LLMs are probabilistic text generators, but business rules require deterministic enforcement. No amount of prompt engineering can guarantee an LLM won't hallucinate a policy that doesn't exist.
## How Sentinel Works
```
User ←→ LLM Agent ←→ Sentinel ←→ Tools (MCP)
                         │
                    ┌────┴────┐
                    │ Solver, │
                    │ Facts,  │
                    │ Guards  │
                    └─────────┘
```
- Rules are expressed as composable predicates with backtracking proof search, not natural language in a prompt (see the sketch after this list).
- When the agent wants to offer refund advice or issue a refund, it first checks with Sentinel. The solver recursively calls other MCP tools to gather the data needed to make a decision. If the action is permitted, a proof is produced (and explained to the user if relevant). Otherwise a set of blockers is returned, guiding the LLM towards a solution.
- Facts come from tool calls or explicit user confirmation ("askable" predicates). The LLM formulates symbolic queries; it never injects literal values into the reasoning chain.
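To make the idea of rules as composable predicates concrete, here is a minimal self-contained Haskell sketch (all names are hypothetical, not Sentinel's actual API). Facts stand in for data gathered from tools or askables; the list monad supplies backtracking, so every element of the result is an alternative derivation, recorded as a proof tree.

```haskell
-- Facts are gathered from MCP tool calls or user confirmation ("askables").
data Fact
  = BookedWithin24h
  | FlightCancelledByAirline
  | TicketRefundable
  deriving (Eq, Show)

-- A proof records which rule fired and the sub-proofs it relied on.
data Proof = Axiom Fact | Rule String [Proof]
  deriving Show

-- A predicate derives proofs of a goal from the known facts.
-- The list monad gives backtracking: each element is an alternative proof.
type Predicate = [Fact] -> [Proof]

-- A goal that succeeds iff the fact is known.
fact :: Fact -> Predicate
fact f facts = [Axiom f | f `elem` facts]

-- A named rule succeeds when all of its sub-goals succeed.
rule :: String -> [Predicate] -> Predicate
rule name subgoals facts = Rule name <$> mapM ($ facts) subgoals

-- Disjunction: try each alternative derivation in turn.
anyOf :: [Predicate] -> Predicate
anyOf alts facts = concatMap ($ facts) alts

-- Hypothetical refund policy: three independent ways to qualify.
refundPermitted :: Predicate
refundPermitted = anyOf
  [ rule "24h-cooling-off"   [fact BookedWithin24h]
  , rule "airline-cancelled" [fact FlightCancelledByAirline]
  , rule "refundable-fare"   [fact TicketRefundable]
  ]

main :: IO ()
main = print (refundPermitted [FlightCancelledByAirline])
-- [Rule "airline-cancelled" [Axiom FlightCancelledByAirline]]
```

Each alternative in `anyOf` is a separate policy path; an empty result means no derivation succeeded, which is where a system like Sentinel can report which paths were explored and what blocked them.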
## Examples
So far we have developed two examples:
- Air Canada refund logic.
- UK passport application logic (based on this blog post).
More on the way!
## Getting Started
### Prerequisites
- Nix (recommended) or GHC 9.12 + cabal
- An OpenAI API key (set the `OPENAI_API_KEY` environment variable)
### Build

```bash
# With Nix (recommended)
nix develop   # Enter dev shell
just build    # Build the project

# Or directly
nix build     # Build the Nix package
```
### Run

```bash
# Air Canada refund policy demo
cabal run repl -- --example aircanada --user usr_james_doe

# AirLogic airlines demo
cabal run repl -- --example airlogic --user usr_sarah_chen

# UK passport eligibility demo
cabal run repl -- --example passport --user james

# See all options
cabal run repl -- --help
```
## Try It
In the Air Canada example, try asking:
"Iβd like a refund for my delayed flight."
The agent will fetch your bookings, establish context, run the solver to check eligibility, and either process the refund (with guard checks) or explain exactly which policy paths were explored and why they failed.
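To make the "proof or blockers" outcome concrete, here is a small sketch of the shape such a result could take and how an agent loop might branch on it. The types and names are hypothetical, not Sentinel's actual API.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Hypothetical result of a guard check: either a rendered proof the agent
-- can explain to the user, or the blockers that still need resolving.
data CheckResult
  = Permitted String   -- e.g. "airline-cancelled: carrier cancelled the flight"
  | Blocked [String]   -- e.g. ["confirm: fare is refundable"]

-- How an agent might turn the result into its next message (illustrative).
respond :: CheckResult -> String
respond = \case
  Permitted why -> "Refund issued. Policy basis: " ++ why
  Blocked bs    -> "Refund not yet permitted. Outstanding: " ++ unwords bs

main :: IO ()
main = putStrLn (respond (Blocked ["confirm: fare is refundable"]))
```

The key design point is that the agent never decides eligibility itself: it only relays the solver's verdict, and a `Blocked` result tells it exactly which facts to gather or confirm next.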