Abstention | Sutro Handbook

Abstention is the decision not to assign a normal task label. Instead of forcing the classifier to choose from the main label set, you give it a way to say that the input cannot be classified reliably.

This is not a weakness in the classifier design. In many production systems, abstention is what prevents the model from turning ambiguity into false certainty.

When abstention belongs in the label set

Add an abstention label when a human expert would sometimes refuse to make the decision from the available evidence.

Common abstention labels include:

unclear
insufficient_evidence
not_applicable
out_of_scope
needs_human_review

The right label depends on what the downstream system needs to know. `unclear` means the case may be in scope, but the evidence is ambiguous. `out_of_scope` means the input does not belong in the task at all. `needs_human_review` means the model should not be the final decision-maker, even if it has a guess.
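One way to carry these distinctions into code is to keep the label, an optional best guess, and a short reason together in one record. The sketch below is illustrative only: the `Decision` type, its `best_guess` and `reason` fields, and the `refund_request` task label are assumptions made for the example, not part of any particular library.

```python
from dataclasses import dataclass
from typing import Optional

# The abstention labels listed above.
ABSTENTION_LABELS = frozenset({
    "unclear",
    "insufficient_evidence",
    "not_applicable",
    "out_of_scope",
    "needs_human_review",
})


@dataclass(frozen=True)
class Decision:
    """A classifier output that may be a task label or an abstention."""

    label: str
    # Hypothetical field: the model's guess, kept even when it abstains,
    # so a reviewer can see what the model would have chosen.
    best_guess: Optional[str] = None
    # Hypothetical field: a short explanation of why this label was chosen.
    reason: str = ""

    @property
    def abstained(self) -> bool:
        return self.label in ABSTENTION_LABELS


decision = Decision(
    label="needs_human_review",
    best_guess="refund_request",  # hypothetical task label
    reason="Customer mentions a chargeback; automated handling is risky.",
)
```

Keeping the guess next to the abstention means the downstream system can tell "no idea" apart from "probably this, but a person should confirm".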

What abstention is for

Abstention is useful when the cost of a wrong label is higher than the cost of deferring the decision.
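This trade-off can be written as a simple expected-cost comparison. The following is a minimal sketch, assuming you have a reasonably calibrated probability that the top label is correct and rough estimates of both costs in the same unit. The function name and parameters are hypothetical.

```python
def should_abstain(p_correct: float, cost_wrong: float, cost_defer: float) -> bool:
    """Return True when answering is expected to cost more than deferring.

    p_correct:  estimated probability that the top label is correct
                (assumed to be calibrated; that is an assumption of this sketch)
    cost_wrong: cost of acting on a wrong label
    cost_defer: cost of deferring the decision, e.g. a human review
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    expected_cost_of_answering = (1.0 - p_correct) * cost_wrong
    return expected_cost_of_answering > cost_defer


# Same confidence, different stakes: only the high-stakes case abstains.
high_stakes = should_abstain(p_correct=0.9, cost_wrong=100.0, cost_defer=5.0)
low_stakes = should_abstain(p_correct=0.9, cost_wrong=10.0, cost_defer=5.0)
```

In practice the costs are rarely known precisely, so a rule like this is better treated as a way to reason about thresholds than as an exact policy.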

Good use cases include:

policy or compliance decisions where false certainty creates risk
routing tasks where the wrong destination is expensive
extraction or classification tasks with incomplete input data
eval judges where the result cannot be determined from the artifact being judged
classifiers that operate on messy user-generated text, logs, tickets, or conversations

In these cases, the goal is not to maximize the number of classified examples. The goal is to maximize the number of correct, useful decisions the system can make without pretending to know more than it does.
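To measure this goal rather than raw classification volume, it can help to report how often the system answers alongside how often those answers are correct. The sketch below is one possible way to compute both. It assumes gold labels are available and counts an abstention as neither correct nor wrong; the function name and the example labels are hypothetical.

```python
from typing import Dict, Iterable, List


def selective_metrics(
    predictions: List[str],
    gold: List[str],
    abstention_labels: Iterable[str],
) -> Dict[str, float]:
    """Compute coverage, accuracy on answered cases, and correct decisions."""
    abstain = set(abstention_labels)
    # Keep only the cases where the model committed to a task label.
    answered = [(p, g) for p, g in zip(predictions, gold) if p not in abstain]
    correct = sum(1 for p, g in answered if p == g)
    coverage = len(answered) / len(predictions) if predictions else 0.0
    accuracy = correct / len(answered) if answered else 0.0
    return {
        "coverage": coverage,
        "accuracy_on_answered": accuracy,
        "correct_decisions": float(correct),
    }


metrics = selective_metrics(
    predictions=["billing", "technical", "unclear", "billing"],
    gold=["billing", "billing", "technical", "billing"],
    abstention_labels={"unclear", "out_of_scope"},
)
```

Reading the two numbers together shows whether abstention is buying accuracy or just reducing the number of decisions made.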

What abstention is not for

Abstention should not become a junk drawer for every hard example. If too many cases fall into the abstention label, the task may be underspecified, the labels may overlap, or the model may need more context.
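A simple way to notice the junk-drawer pattern is to track the abstention rate and break it down by label. The sketch below is illustrative. The `max_rate` threshold of 0.2 is an arbitrary placeholder, not a recommended value; what counts as "too many" depends on the task.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def abstention_report(
    labels: List[str],
    abstention_labels: Iterable[str],
    max_rate: float = 0.2,  # placeholder threshold; tune for your task
) -> Dict[str, Any]:
    """Summarize how often, and with which labels, the classifier abstains."""
    abstain = set(abstention_labels)
    counts = Counter(label for label in labels if label in abstain)
    rate = sum(counts.values()) / len(labels) if labels else 0.0
    return {
        "rate": rate,
        "by_label": dict(counts),
        "needs_attention": rate > max_rate,
    }


report = abstention_report(
    labels=["billing", "unclear", "unclear", "technical", "out_of_scope"],
    abstention_labels={"unclear", "out_of_scope"},
)
```

A high rate concentrated in one label is a hint about where to look: many `unclear` results may point to overlapping labels, while many `insufficient_evidence` results may point to missing context.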

Avoid abstention when the downstream workflow truly requires a best-effort guess. For example, if a routing system must always pick the most likely team and can cheaply recover from mistakes, a ranked or fallback route may be more useful than an abstain label.
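A ranked route can be as simple as returning the top few destinations by score instead of a single label. The sketch below assumes the classifier exposes a per-team score; the function name, the team names, and the scores are all hypothetical.

```python
from typing import Dict, List


def ranked_routes(scores: Dict[str, float], top_k: int = 2) -> List[str]:
    """Return the top_k destinations, most likely first.

    The first entry is the primary route; the rest are fallbacks to try
    if the primary team rejects or bounces the item.
    """
    if not scores:
        raise ValueError("at least one candidate route is required")
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [team for team, _ in ordered[:top_k]]


routes = ranked_routes({"billing": 0.5, "support": 0.3, "sales": 0.2})
primary, fallbacks = routes[0], routes[1:]
```

This keeps the workflow moving on every input while still recording where the item should go next if the first choice turns out to be wrong.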

The most common failure mode is defining abstention too vaguely. If the model is told to use `unclear` whenever it is "not sure", the label will behave inconsistently. The instructions should explain what evidence is missing, which kinds of ambiguity matter, and when the model should still choose the best available label.

Designing abstention criteria

Treat abstention like any other label: define when it applies, when it does not apply, and what should happen next.

Abstention label	Use when...	Do not use when...
`unclear`	The input is in scope, but the evidence supports multiple plausible labels.	The model can choose a label using the provided rubric.
`insufficient_evidence`	The input is missing information required to make the decision.	The information is present but difficult to interpret.
`not_applicable`	The input does not contain the kind of object the classifier is designed to classify.	The input is relevant but belongs to an uncommon class.
`out_of_scope`	The input belongs to a different task or domain entirely.	The input is in domain but ambiguous.
`needs_human_review`	A wrong automated decision would be costly enough that a human should decide.	The model is merely uncertain in a low-stakes setting.
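Criteria like these can be kept as data and rendered into the classifier's instructions, so the definitions live in one place. The sketch below is one possible shape, with two rows copied from the table above; the `AbstentionSpec` type and `render_instructions` function are hypothetical names, and the output format is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AbstentionSpec:
    label: str
    use_when: str
    do_not_use_when: str


# Two rows from the table above; the remaining rows follow the same pattern.
SPECS = [
    AbstentionSpec(
        label="unclear",
        use_when="The input is in scope, but the evidence supports multiple plausible labels.",
        do_not_use_when="The model can choose a label using the provided rubric.",
    ),
    AbstentionSpec(
        label="insufficient_evidence",
        use_when="The input is missing information required to make the decision.",
        do_not_use_when="The information is present but difficult to interpret.",
    ),
]


def render_instructions(specs: List[AbstentionSpec]) -> str:
    """Render the criteria as plain-text instructions for a prompt."""
    blocks = []
    for spec in specs:
        blocks.append(
            f"{spec.label}\n"
            f"  Use when: {spec.use_when}\n"
            f"  Do not use when: {spec.do_not_use_when}"
        )
    return "\n\n".join(blocks)


instructions = render_instructions(SPECS)
```

Stating both sides of each criterion in the instructions gives the model the same boundary a human annotator would use.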

One useful test: if the model abstains, a human should be able to understand why and know what to do next. If the abstention label does not trigger a clear downstream action, it may be too vague to be useful.
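This test can be partly automated by requiring every abstention label to map to a named next step. In the sketch below, the actions are placeholders invented for illustration; the real actions depend on your workflow.

```python
from typing import Dict, Iterable, List

ABSTENTION_LABELS = (
    "unclear",
    "insufficient_evidence",
    "not_applicable",
    "out_of_scope",
    "needs_human_review",
)

# Placeholder actions for illustration only.
NEXT_ACTION: Dict[str, str] = {
    "unclear": "queue for a reviewer with the candidate labels attached",
    "insufficient_evidence": "request the missing information from the submitter",
    "not_applicable": "skip classification and log the input",
    "out_of_scope": "forward to the owner of the correct task",
    "needs_human_review": "escalate to a human decision-maker",
}


def missing_actions(labels: Iterable[str], actions: Dict[str, str]) -> List[str]:
    """Return the labels that have no downstream action defined."""
    return [label for label in labels if not actions.get(label)]


unhandled = missing_actions(ABSTENTION_LABELS, NEXT_ACTION)
```

Running a check like this when labels change helps catch an abstention label that was added without deciding what should happen when the model uses it.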