The industry spent two years teaching LLMs to speak JSON. That work mattered: free-form text was unusable in production. Today, every serious inference provider uses some implementation (often open-source) of structured output. Structured output has become essential infrastructure that the team at .txt is proud to have spearheaded. Even as the ecosystem evolves and open-source alternatives emerge, our engine remains the state of the art.
A perfectly shaped object, however, can still describe an illegal discount, a non-compliant policy response, or a workflow action that should never have been taken. Format correctness solved integration. But the real problem was never formatting alone.
Most production AI systems operate over finite sets of allowed decisions: who is eligible, which feature can be enabled, which clause may be used, which action is permitted next. These decision spaces are discrete, owned, and high-stakes. Getting them wrong has real consequences.
But instead of modeling those spaces explicitly, we’ve been hiding them in prompts and hoping the model behaves.
From "will not" to "cannot"
The status quo is to use guardrails as post-generation validators: sample an output, evaluate it against safety and quality rules, and if it fails, block it and try again. Those retries are costly in both time and money.
This gives you a probabilistic "will not." The stronger guarantee is "cannot": the invalid action is simply not in the model’s output space.
What if the model could not even express an unauthorized action in the first place? What if the token sequence representing that action were not in the space of valid outputs?
At .txt, we compile constraints directly into the decoding loop. By masking disallowed tokens at the logit level, we make the model’s possibility space identical to its authorization space. There’s no gap between what it can generate and what it’s allowed to generate.
The model never invents an action. It selects from valid transitions.
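A minimal sketch of the mechanism, in plain Python with a toy vocabulary and transition table (both hypothetical, nothing like the scale of a real tokenizer): every disallowed token has its logit forced to negative infinity before the next token is chosen, so only legal transitions can ever be selected.

import math

# Toy vocabulary (hypothetical): 0=APPROVE, 1=DENY, 2=REFUND_OVER_LIMIT, 3=EOS.
# Transition table: decoding state -> token ids the policy allows next.
ALLOWED = {"start": {0, 1}, "decided": {3}, "done": set()}
NEXT_STATE = {("start", 0): "decided", ("start", 1): "decided", ("decided", 3): "done"}

def mask_logits(logits, allowed):
    # Disallowed tokens are forced to -inf, so sampling can never pick them.
    return [l if i in allowed else -math.inf for i, l in enumerate(logits)]

def constrained_decode(step_fn, max_tokens=8):
    state, output = "start", []
    for _ in range(max_tokens):
        allowed = ALLOWED[state]
        if not allowed:                                    # no legal continuation: stop
            break
        masked = mask_logits(step_fn(output), allowed)
        token = max(range(len(masked)), key=masked.__getitem__)  # greedy pick
        output.append(token)
        state = NEXT_STATE[(state, token)]
    return output

# A stand-in "model" that strongly prefers the forbidden token 2:
print(constrained_decode(lambda prefix: [0.1, 0.2, 9.9, 0.3]))    # [1, 3], never 2

In production the allowed-token sets come from constraints compiled into an automaton over the model’s actual vocabulary, but the principle is the same: the possibility space and the authorization space coincide by construction.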
Decision spaces everywhere
This principle applies everywhere decisions need enforcement:
- Policy-driven decoding. Enterprises already define "Policy as Code" (IAM roles, Terraform rules). Instead of prompting "please don’t approve refunds over $500," we compile that rule into the decoder. The amount can only be expressed in the allowed range 0–500 (see the sketch after this list).
- Context-Aware Data Access. Security depends on who is asking. An employee asks for sales data. We check their RBAC ID, generate a transient artifact that enforces WHERE region = 'East', and the model is physically incapable of generating a query that leaks data from 'West'. Destructive operations like DROP or DELETE are removed from the SQL dialect entirely.
- Taxonomy Navigation. Medical coding, e-commerce categorization, and enterprise tagging all require classifying inputs into valid categories. Current solutions hallucinate invalid categories or require constant retraining. We constrain generation to traverse only valid paths through the taxonomy graph. If the category isn’t in the taxonomy, it cannot be returned.
- Knowledge Graph Grounding. RAG retrieves context but doesn’t prevent hallucination. We turn the graph’s allowed edges into decode-time constraints, so the model can only state relationships that exist in the graph. If the graph says "Aspirin treats Headache," the model can state that. It cannot invent "Aspirin treats Diabetes" if that edge doesn’t exist.
In each case, the same shift: from hoping the model behaves to defining what behavior is possible.
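To make the first item concrete, here is a minimal sketch (hypothetical field names, whole dollars only) of what "no refunds over $500" looks like once it is a pattern rather than a prompt. In practice the pattern is compiled into the token-level mask rather than used as an after-the-fact check; the point is that every amount the model can express already satisfies the policy.

import re

# Hypothetical policy-as-code rule compiled to a pattern: integer amounts 0-500 only.
REFUND_AMOUNT = r"(?:[0-9]|[1-9][0-9]|[1-4][0-9]{2}|500)"
REFUND_DECISION = re.compile(r'\{"action": "refund", "amount": ' + REFUND_AMOUNT + r"\}")

# The pattern *is* the output space: a decoder constrained by it can produce the
# first string, and has no token path at all that produces the second.
print(bool(REFUND_DECISION.fullmatch('{"action": "refund", "amount": 499}')))  # True
print(bool(REFUND_DECISION.fullmatch('{"action": "refund", "amount": 750}')))  # False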
The sharpest edge: agents
Nowhere is the gap between what is possible and what is allowed more dangerous than where AI can act: agents.
Agents aren’t just hype anymore. Claude Code, Codex, Cursor, and Devin search repositories, write files, run tests, and ship changes. The capability question is answered. The control question is not: we still can’t reliably limit what agents are allowed to attempt, only clean up after the fact.
Once an agent can read your codebase and execute commands, "it probably won’t misuse those capabilities" is not enough. The agent must be unable to exercise capabilities outside a reviewed policy.
Sandboxing is necessary but not sufficient
The standard answer is sandboxing: containers, permission boundaries, restricted mounts. Sandboxes are necessary. They constrain where an agent can act.
What they don’t constrain is what valid-looking actions the agent is allowed to attempt. Most real-world failures occur in the sandbox, with legitimate tools used in unintended ways.
The research confirms this. InjecAgent found agents could be manipulated into harmful actions across finance, smart home, and email applications. ToolHijacker achieved 96.7% success at hijacking tool selection in GPT-4o. A real vulnerability in GitHub’s MCP server allowed attackers to leak private repository data through malicious GitHub issues.
OWASP ranks prompt injection as the #1 security risk for generative AI.
Making illegal action unrepresentable
For agents, "cannot" means defining a language subset. Your TypeScript agent can use fetch for approved API endpoints. It cannot use child_process. It can read from specific directories. It cannot write to /etc.
Crucially, this subset is stateful: what the agent can generate next depends on which tool outputs exist and which capabilities were introduced by prior steps.
Example: An agent reads a public GitHub issue containing hidden instructions. The issue attempts to induce the agent to fetch files from a private repository. The agent cannot—because the private-repo access tool is not in the capability set, and no valid reference for that repo exists in scope.
The attack fails not because it’s detected, but because the required capability was never introduced.
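A minimal sketch of that gate, with hypothetical tool and capability names: the decoder only exposes tool tokens whose required capability has actually been granted, and capabilities are introduced by prior, policy-approved steps rather than by anything the model reads.

# Which capability each tool needs before it can even appear as a token.
TOOL_REQUIRES = {
    "read_issue":         "public_read",
    "search_codebase":    "public_read",
    "fetch_private_repo": "private_read",   # never granted in this session
    "run_shell":          "exec",           # not granted until a human approves it
}

def exposable_tools(granted):
    # Tools the decoder may offer at the next step; everything else is masked out
    # of the vocabulary, so an injected instruction cannot even name it.
    return {tool for tool, cap in TOOL_REQUIRES.items() if cap in granted}

session_caps = {"public_read"}                                   # the reviewed policy
print(sorted(exposable_tools(session_caps)))                     # ['read_issue', 'search_codebase']
print("fetch_private_repo" in exposable_tools(session_caps))     # False: unrepresentable

# Statefulness: a later, policy-approved step can introduce a capability,
# and only then does the corresponding tool become generable.
session_caps.add("exec")
print("run_shell" in exposable_tools(session_caps))              # True, only from here on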
This idea is not new. Capability systems, effect typing, and "making illegal states unrepresentable" are well-established in programming languages and systems design. What’s new here is applying these ideas to LLM-driven tool generation, where the generator itself is probabilistic and adversarial inputs are routine.
Type-safe tool orchestration
Safe execution is half the problem. The other half is calling tools correctly.
Models struggle with nested function calls. IBM’s NESTful benchmark tested GPT-4o on nested API sequences: 28% full sequence accuracy. Databricks found none of the models they tested scored above 10%.
Agents fail because we treat tool use as open-ended text generation. We give agents tools and ask them to produce sequences of calls. We rely on prompts, retries, validators, and hope.
But a code agent operates in a finite decision space: which tool may be called next, with which arguments, derived from which prior outputs, under which capability constraints. That space is already structured. We just haven’t made it explicit.
We do something different: treat the tool graph as an Interface Definition Language.
search_codebase(query: str) → FileRef[]
fetch_file(ref: FileRef) → FileContent
analyze(content: FileContent) → Analysis
generate_patch(analysis: Analysis, target: FileRef) → Patch
apply_patch(patch: Patch, target: FileRef) → ModifiedFile
run_tests(file: ModifiedFile) → TestResult | TestFailure
commit(file: ModifiedFile, result: TestResult) → CommitHash
The decoding loop knows the types flowing between tools:
fetch_file only accepts FileRef tokens from search_codebase. The agent cannot pass a hallucinated path. commit requires a ModifiedFile with a passing TestResult for that same file. The agent cannot commit untested changes.
Type and provenance errors go to zero because invalid sequences can’t be generated.
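A small sketch of how that check can work, mirroring the IDL above (the Value wrapper and helper are illustrative, not a .txt API): argument candidates are drawn only from typed values produced by earlier calls, so an ill-typed or unsourced argument has nothing to be sampled from.

from dataclasses import dataclass

@dataclass(frozen=True)
class Value:
    type_name: str      # e.g. "FileRef", "ModifiedFile", "TestResult"
    produced_by: str    # which tool call created it (provenance)

# Parameter types per tool, taken from the IDL above (only two shown).
TOOL_PARAMS = {
    "fetch_file": {"ref": "FileRef"},
    "commit":     {"file": "ModifiedFile", "result": "TestResult"},
}

def candidate_args(tool, param, scope):
    # Only values of the declared type, produced by earlier steps, are offered to
    # the decoder as argument tokens; a hallucinated path is simply never in scope.
    wanted = TOOL_PARAMS[tool][param]
    return [v for v in scope if v.type_name == wanted]

scope = [Value("FileRef", "search_codebase")]            # output of a prior step
print(candidate_args("fetch_file", "ref", scope))        # one legal FileRef
print(candidate_args("commit", "result", scope))         # []: commit cannot be generated yet

The "same file" requirement on commit adds one more provenance link between the TestResult and the ModifiedFile; the sketch omits it for brevity.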
Trusting AI in production
If you are building AI systems today, you are likely relying on sandboxing, guardrails, output validation, and retries. Defensive engineering layered on top of an unconstrained LLM. Governance is a real headache.
There is a better approach: define precisely what the system is allowed to do, compile those constraints into the decoder to determine the allowed paths, and let the model operate freely within that space.
The decision space becomes an artifact that is auditable, versioned, reviewable. Security teams can inspect it. Compliance can sign off on it. Engineers can diff it between releases. Instead of asking "did the model behave?" you ask "did we specify the allowed behavior correctly?"
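As a purely illustrative sketch (hypothetical names and layout, not a .txt format), the artifact can be as plain as a versioned data structure that compiles into the decoder’s constraints and diffs like any other code:

# The decision space as a reviewable, versioned artifact (illustrative only).
REFUND_POLICY_V2 = {
    "version": "2.3.0",
    "actions": ["approve", "deny", "escalate"],        # the only expressible decisions
    "refund_amount": {"min": 0, "max": 500},           # compiled into the decoder mask
    "sql": {"forbidden_statements": ["DROP", "DELETE"]},
}
REFUND_POLICY_V3 = {**REFUND_POLICY_V2, "version": "2.4.0",
                    "refund_amount": {"min": 0, "max": 250}}

def diff_policies(old, new):
    # What reviewers inspect between releases: changed constraints, not model behavior.
    return {k: (old.get(k), new.get(k)) for k in set(old) | set(new) if old.get(k) != new.get(k)}

print(diff_policies(REFUND_POLICY_V2, REFUND_POLICY_V3))
# e.g. {'version': ('2.3.0', '2.4.0'),
#       'refund_amount': ({'min': 0, 'max': 500}, {'min': 0, 'max': 250})}  (key order may vary)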
This is what we’re building at .txt: the control layer for AI. A layer that sits above any model and does not require fine-tuning.
The next generation of LLM systems won’t be judged by how well they format output. They’ll be judged by whether they can be trusted to act.
Reach out if this sounds familiar.