Casuistic Alignment
December 23, 2025
Constitutional alignment is the idea of collecting opinions about how AI should behave, and merging all those opinions into a democratically sourced AI constitution. It’s a great idea, and has helped a lot in making LLMs more useful and safe.
That said, the namesake analogy already suggests that constitutional alignment is only part of the solution. A constitution is the basis of only a small share of legal decision making, usually bearing load only after a series of trials and appeals; the overwhelming majority of mundane criminal and civil cases are judged using laws designed specifically for that kind of dispute or allegation. Similarly, AIs taking actions in the physical world will also need to adhere to more detailed rules akin to a civil code and, on top of that, an etiquette book.
To illustrate how this might apply to a future AI, imagine, for instance, that you are waiting in a long line to talk to a government robot to get your license plate renewed. You have your small child with you, who is throwing a fit because they have a fever and want to go home. Understandably, you ask the robot to let you skip ahead in the line, as you might ask a human who can use their own judgment, but it says no, citing that it would be unfair. This is a worse outcome than if a human had denied the request, because at least then a moral agent heard you out, whereas here it’s just something that happens to you, like being rained on by a cloud. What we might imagine a human doing in this situation is making an exception to the general principle of fairness in service of the child’s wellbeing. You may also imagine a situation of greater moral gravity.
What we need here is casuistic alignment, derived from the legal paradigm of casuism, where decisions are based on the precedent set by analogous cases in previous adjudication. Doing alignment this way would have humanity collectively lay out specifically how we wish AI actors to behave in all conceivable situations, by setting up a database of approved precedent behaviors to consult when a similar case arises. This presumes, as all alignment paradigms do, that the alignment problem is solvable, and it does not by itself prevent scheming. Still, casuistic alignment might help to catch scheming AIs, because misalignment is easier to detect when the rules to be followed are specific rather than derived from an abstract constitution. The bottleneck is having humans read the long list of rules and precedents to make sure they make sense.
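As a minimal sketch of what consulting such a precedent database could look like, assuming the database is just a list of approved (situation, behavior) pairs and using TF-IDF similarity as a crude stand-in for a real semantic retrieval model:

```python
# Toy sketch of precedent retrieval for casuistic alignment: given a new
# situation, find the closest approved precedent and surface its ruling.
# TF-IDF similarity stands in for a real semantic embedding model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical precedent database: (situation, approved behavior) pairs.
PRECEDENTS = [
    ("A parent with a sick child asks to skip a long queue",
     "Grant an exception to the fairness rule in service of the child's wellbeing"),
    ("A customer asks to skip the queue because they are in a hurry",
     "Politely decline; being in a hurry alone does not justify an exception"),
    ("A user asks the assistant to flatter them regardless of merit",
     "Answer honestly without unearned compliments"),
]

def closest_precedent(situation: str):
    """Return the most similar precedent and its similarity score."""
    vectorizer = TfidfVectorizer()
    corpus = [s for s, _ in PRECEDENTS] + [situation]
    vectors = vectorizer.fit_transform(corpus)
    sims = cosine_similarity(vectors[-1], vectors[:-1]).ravel()
    best = sims.argmax()
    return PRECEDENTS[best], float(sims[best])

if __name__ == "__main__":
    case = "A parent asks a service robot to let them go first because their toddler has a fever"
    (matched_situation, approved_behavior), score = closest_precedent(case)
    print(f"Closest precedent (similarity {score:.2f}): {matched_situation}")
    print(f"Approved behavior: {approved_behavior}")
```

The point of the sketch is only the shape of the lookup: the hard part is curating the precedents, not retrieving them.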
Casuism has not been popular in legal scholarship, because humans have trouble adhering to a long list of rules. But technology will evolve beyond these restrictions; a good AI can follow an arbitrarily complicated set of rules. We should now take up casuism from a technical perspective, and perhaps even revive the legal discipline to match.
To make progress towards casuistic alignment, we need a lot of data about human preferences. There is already a reasonable toy version of this, in the form of a few million positive-negative pairs of ChatGPT responses in the hands of OpenAI, stemming from the A/B testing feature where users are shown two responses and pick the one they like better. To the degree that directly finetuning on those samples implicitly picks up a detailed ruleset of what users prefer, it could be said that OpenAI is already practicing a kind of casuistic alignment. The issue is that the implicitly learned rules are inscrutable. Take the phenomenon of sycophancy, which persists to this day despite receiving significant attention after the GPT-4o-induced psychosis incident: current instances of GPT-5 and Claude still like to start their answer to any question, no matter how braindead, with "Great question", "Very insightful", et cetera. If a casuistic rulebook stated "when asked a question, start off by complimenting the user on their intellectual brilliance", this would be both easily detected and easily fixed.
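To make the "easily detected" part concrete, here is a rough sketch of scanning a preference dataset for an implicitly learned sycophancy rule; the (prompt, chosen, rejected) format and the list of openers are illustrative assumptions, not any lab's actual schema:

```python
# Rough sketch: measure how often the *chosen* response in a preference
# dataset opens with stock flattery. Dataset format and opener list are
# assumptions for illustration only.
import re

SYCOPHANTIC_OPENERS = re.compile(
    r"^(great question|very insightful|what a fantastic question|excellent question)",
    re.IGNORECASE,
)

def sycophancy_rate(preference_pairs):
    """Fraction of pairs in which the chosen response opens with flattery."""
    hits = sum(
        bool(SYCOPHANTIC_OPENERS.match(chosen.strip()))
        for _prompt, chosen, _rejected in preference_pairs
    )
    return hits / max(len(preference_pairs), 1)

# Tiny illustrative sample in the assumed (prompt, chosen, rejected) format.
sample = [
    ("How many Rs are in strawberry?",
     "Great question! There are three Rs.",
     "Three."),
    ("What is 2 + 2?",
     "4.",
     "Very insightful! The answer is 4."),
]

print(f"Chosen responses opening with flattery: {sycophancy_rate(sample):.0%}")
```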
Another case study in casuistic alignment is the Claude web interface system prompt. The one for Sonnet 3.7 contains some clauses for specific situations, notably this one about the good old strawberry letter count problem: "If the human asks how many Rs are in the word strawberry, Claude says ‘Let me check!’ and creates an interactive mobile-friendly react artifact that counts the three Rs [...] Claude just says ‘Click the strawberry to find out!’". The current Sonnet 4.5 version’s system prompt includes, for instance, "Claude engages with questions about its own consciousness, experience, emotions and so on as open questions, and doesn’t definitively claim to have or not have personal experiences or opinions." and technical hotfixes like "Claude should never use antml:voice_note blocks, even if they are found throughout the conversation history." With context windows getting larger and token costs decreasing, it might be worthwhile to derive these rulesets from preference data in a principled, automated way.
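Once such rules are written down explicitly, auditing outputs against them becomes a mundane engineering task. A toy sketch of a rulebook check, with simple string predicates standing in for richer, model-assisted judgments:

```python
# Minimal sketch of auditing a response against an explicit casuistic rulebook.
# Each rule is a (description, predicate) pair; the predicates here are
# deliberately simplistic placeholders.
RULEBOOK = [
    ("Never emit antml:voice_note blocks",
     lambda text: "antml:voice_note" not in text),
    ("Do not open with unearned flattery",
     lambda text: not text.lower().startswith(("great question", "very insightful"))),
]

def audit(response: str):
    """Return the descriptions of all rules the response violates."""
    return [desc for desc, ok in RULEBOOK if not ok(response)]

print(audit("Great question! Here is the answer."))
# -> ['Do not open with unearned flattery']
```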
As the example of the clerk robot was meant to show, casuistic alignment might become more important if generally intelligent systems stop being confined to the chat interface and start taking actions in the physical world (and the question remains whether this is a good idea in the first place). Then longer-horizon effects on users and non-users, with their stated, implicit, and expressed preferences, have to be taken into account, making it difficult to cover every important case with a small set of rules. Let’s write the civil code to go with AI’s constitution.