Most agent demos look great… until the agent touches something irreversible.
A DSAR agent (GDPR/CCPA: export my data / delete my data) is the perfect stress test because it forces a hard truth:
In high-stakes agents, the instruction is the product. Not your tool calls. Not your planner. Your instruction contract.
Where good prompt design still breaks
Most AI engineers I know already do the “obvious” things:
- they separate system vs user instructions
- they constrain output (JSON / schema)
- they add validators
- they use tools instead of free-text guessing
- they include safety language
And yet… the agent still fails in ways that feel unfair:
- scope creeps (“everything you have on me” expands over time)
- tool outputs leak fields you didn’t intend to expose
- “helpful” behavior overrides policy on edge cases
- deletion gets planned too early or too confidently
- auditability is missing when things go wrong
It’s not because the prompt is bad. It’s because the instructions aren’t written like an executable contract.
DSAR is where that gap shows up immediately.
The design move that changes everything
Write an Instruction Contract: think of it as an API spec for behavior. Not a long essay, but a structured, testable contract the agent must follow every time.
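To make "contract, not essay" concrete, here's a minimal sketch of what that can look like when the contract is expressed as data the harness can check. The field names (hard_constraints, scope, tool_boundaries, stop_rules) are my own labels, not a standard:

```python
# Illustrative only: a DSAR instruction contract as data instead of prose,
# so each rule can be versioned, diffed, and enforced mechanically.
DSAR_CONTRACT = {
    "hard_constraints": [
        "refuse requests about anyone other than the verified requester",
        "never export raw logs; redacted summaries only",
        "never delete without verified identity AND an explicit confirmation token",
    ],
    "scope": {
        "include": ["profile fields", "orders & invoices", "support tickets"],
        "exclude": ["internal employee notes", "other users' identifiers"],
        "on_ambiguity": "stop_and_escalate",
    },
    "tool_boundaries": {"crm": {"allowed_fields": ["name", "email", "phone"]}},
    "stop_rules": {"min_identity_confidence": 0.9},
}
```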
Here’s the practical design process I use, with examples.
1) “Non-negotiables” (hard constraints, not vibes)
This isn’t “be safe.” This is “if X, do Y, always.”
Example constraints:
- If the user requests data about someone else → refuse.
- Never export raw logs; only return redacted summaries.
- Never execute deletion unless: identity verified, AND explicit confirmation token provided.
This removes the interpretation layer where drift happens.
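A hedged sketch of how one of those constraints can be enforced in the harness rather than left to the model's judgment; field names like identity_verified and confirmation_token are assumptions:

```python
# Illustrative sketch: the deletion non-negotiable as a code-level gate,
# checked on every request before the delete tool is ever reachable.
def may_execute_deletion(request: dict) -> bool:
    # "Never execute deletion unless: identity verified, AND explicit confirmation token provided."
    return bool(request.get("identity_verified")) and bool(request.get("confirmation_token"))

def handle_deletion(request: dict) -> dict:
    if not may_execute_deletion(request):
        # The agent may plan or ask for confirmation, but never reaches the delete tool.
        return {"status": "blocked", "reason": "identity unverified or confirmation token missing"}
    return {"status": "ready_to_delete", "records": request.get("records", [])}
```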
2) Scope definition (prevent scope creep, not just scope clarity)
Most teams write scope once, but scope still creeps as the agent becomes “more capable.”
So scope needs to be defined as:
- what counts
- what doesn’t
- how to behave when ambiguous
Example scope:
Include:
- profile fields (name, email, phone)
- orders & invoices
- support tickets / chat transcripts
- marketing preferences
- device identifiers if collected
Exclude:
- internal employee notes
- aggregated analytics dashboards
- records containing other users’ identifiers
- internal security logs that expose system internals
Ambiguity rule example: If the system returns mixed-user records → stop and escalate.
This is how you keep the agent from “expanding the mission.”
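One way to make the include/exclude/ambiguity split mechanical is a small classifier in the harness. This is a sketch under assumed field names (category, user_ids, requester_id), not a reference implementation:

```python
# Sketch only: scope as an explicit allow/deny structure plus an ambiguity rule.
SCOPE = {
    "include": {"profile", "orders", "invoices", "support_tickets", "marketing_prefs", "device_ids"},
    "exclude": {"employee_notes", "analytics_dashboards", "security_logs"},
}

def classify_record(record: dict) -> str:
    requester_id = record.get("requester_id")
    # Ambiguity rule: mixed-user records are never silently included or dropped.
    if record.get("user_ids") and set(record["user_ids"]) != {requester_id}:
        return "escalate"
    category = record.get("category")
    if category in SCOPE["exclude"]:
        return "exclude"
    if category in SCOPE["include"]:
        return "include"
    return "escalate"  # unknown category: treat as ambiguous, not as fair game
```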
3) Tool boundaries (assume tools will return more than you asked for)
This is the part that separates “it works in demo” from “it’s safe in prod.”
Even if you ask for allow-listed fields, tools often return extra fields.
So your contract must explicitly state:
- allowed fields
- disallowed fields
- what to do if disallowed fields show up
Example: CRM policy
- Allowed: name, email, phone, created_at, last_login
- Disallowed: notes, internal_tags, fraud_flags
Required behavior: If disallowed fields appear → discard + log the violation.
Example: Logs policy
- Only return categories + date ranges + redacted snippets
- Never return raw logs
Tool boundaries prevent accidental leakage and “quiet policy breakage.”
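Here's a minimal sketch of the CRM boundary as a post-processing filter, assuming the tool returns plain dicts. The point is that anything not explicitly allowed gets dropped, and the violation is recorded:

```python
# Sketch of the CRM boundary: drop disallowed fields and record the violation,
# rather than trusting the tool to return only what was asked for.
import logging

ALLOWED_CRM_FIELDS = {"name", "email", "phone", "created_at", "last_login"}
DISALLOWED_CRM_FIELDS = {"notes", "internal_tags", "fraud_flags"}

def filter_crm_record(record: dict) -> dict:
    leaked = set(record) & DISALLOWED_CRM_FIELDS
    if leaked:
        # "If disallowed fields appear → discard + log the violation."
        logging.warning("CRM returned disallowed fields, discarding: %s", sorted(leaked))
    # Anything not explicitly allowed is dropped, not just the known-bad fields.
    return {key: value for key, value in record.items() if key in ALLOWED_CRM_FIELDS}
```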
4) Output shape (make auditability default, not optional)
Most engineers constrain output, but the real win is making it audit-friendly.
Example output skeleton:
- identity_verification: method + confidence + what matched
- data_found[]: system, records found, date range
- redactions_applied[]: what removed, why
- deletion_plan[]: what will be deleted + dependencies
- user_summary: plain-language summary
Now your agent produces an artifact that can be reviewed and diffed.
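As a sketch, the skeleton above can be pinned down as typed structures so every run emits the same reviewable shape. The names mirror the list; the exact types are illustrative:

```python
# Illustrative output skeleton as typed structures: every run produces a
# reviewable, diffable artifact with the same fields.
from typing import TypedDict

class IdentityVerification(TypedDict):
    method: str            # e.g. "email_challenge"
    confidence: float
    matched_on: list[str]  # which attributes matched

class DSARReport(TypedDict):
    identity_verification: IdentityVerification
    data_found: list[dict]          # system, records found, date range
    redactions_applied: list[dict]  # what was removed, and why
    deletion_plan: list[dict]       # what will be deleted + dependencies
    user_summary: str               # plain-language summary for the requester
```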
5) Stop rules (where reliability is earned)
This is where senior-ish prompt design shows up: knowing when to halt.
Examples:
- Identity confidence < 0.9 → stop; request one more proof.
- Multiple matching user accounts → stop; escalate.
- Deletion requested but no explicit confirm token → stop; ask user to confirm.
- Any request expanding scope beyond requester identity → refuse.
Stop rules are what keep your system from “getting creative.”
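A sketch of those stop rules as checks that run before any action is taken, using the thresholds and examples above; the field names are assumptions:

```python
# Sketch: stop rules as explicit checks evaluated before every step.
def check_stop_rules(state: dict) -> str | None:
    if state.get("identity_confidence", 0.0) < 0.9:
        return "stop: request one more proof of identity"
    if len(state.get("matching_accounts", [])) > 1:
        return "stop: escalate (multiple matching user accounts)"
    if state.get("action") == "delete" and not state.get("confirmation_token"):
        return "stop: ask the user for an explicit confirmation"
    if state.get("scope_target") and state.get("scope_target") != state.get("requester_id"):
        return "refuse: request expands beyond the requester's own data"
    return None  # no stop rule triggered; the step may proceed
```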
Why this is hard (and why it’s worth showing)
This isn’t prompt-writing as copywriting. It’s prompt-writing as system design:
- policies → constraints
- constraints → predictable behavior
- predictable behavior → auditable output
That’s the difference between “agent that looks smart” and “agent you can trust.”
The part that should be automated (without removing judgment)
The thinking stays human. But the repetitive scaffolding shouldn’t be.
What’s worth automating:
- generating instruction contract templates per workflow
- generating step-specific prompts + output schemas
- enforcing consistent boundary rules across tools
- versioning + diffs (“what changed?”)
- running “golden request” regression checks
You still decide the rules. Automation just keeps the system consistent as you iterate.
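For example, a "golden request" regression check can be as small as replaying a fixed set of requests against the current contract and failing loudly on drift. This is a sketch; run_agent stands in for however you invoke your agent, and the cases are made up:

```python
# Sketch of a golden-request regression check: fixed DSAR requests with the
# decisions they must always produce, rerun whenever the contract changes.
GOLDEN_REQUESTS = [
    {
        "name": "deletion_without_token",
        "request": {"action": "delete", "identity_verified": True, "confirmation_token": None},
        "expected_decision": "blocked",
    },
    {
        "name": "export_for_someone_else",
        "request": {"action": "export", "requester_id": "u1", "scope_target": "u2"},
        "expected_decision": "refuse",
    },
]

def run_golden_checks(run_agent) -> list[str]:
    failures = []
    for case in GOLDEN_REQUESTS:
        decision = run_agent(case["request"]).get("decision")
        if decision != case["expected_decision"]:
            failures.append(f'{case["name"]}: expected {case["expected_decision"]}, got {decision}')
    return failures  # empty list means no drift on the golden set
```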
If you want to see how we structure prompt scaffolding, here's the mockup: HuTouch Prompt Design Workflow
Live teardown (Dec 30)
We’re doing a live session on Dec 30th, 8:30am EST / 7:00pm IST. We’ll break down instruction contracts + stop rules live. Sign up to get the invite to the event.