Confidence Scores

How to use confidence signals, agreement checks, and escalation logic without relying on self-reported model confidence.

AI teams often ask models to report a confidence score alongside a prediction. This usually signals an underlying concern about consistency or reliability.

If you find yourself doing this, you probably want to reach for other strategies first to increase consistency. However, calibrated confidence scores can still be extremely useful, especially as an escalation trigger or as a way to queue items for annotation.

Better Sources of Confidence

More useful confidence signals often come from the system around the model:

  • Agreement: do parallel samples converge on the same answer?
  • Evidence: did the model find the facts required to support the answer?
  • Verifier checks: does a separate judge, rule, or retrieval check confirm the output?
  • Logprobs: you can sometimes rely on cumulative logprobs to measure a model's confidence in its result. We won't go into detail here, because this is only situationally useful and can conflict with other best practices we recommend.

These signals are not perfect either, but they are significantly better than self-reported confidence. A model doesn't always know when it doesn't know, especially when it is eager to help (just like us humans).
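
As an illustration of the agreement signal, here is a minimal Python sketch. It assumes you have already collected several parallel samples for the same input as plain strings. The function name agreement_confidence and the light normalization step are hypothetical choices for this example, not part of any particular library.

```python
from collections import Counter


def agreement_confidence(samples):
    """Return (majority_answer, agreement_ratio) for a list of sampled answers.

    Hypothetical helper: the ratio is the share of samples that match the
    most common answer, so 1.0 means every sample agreed.
    """
    if not samples:
        raise ValueError("need at least one sample")
    # Normalize lightly so trivial formatting differences are not counted
    # as disagreement. Real answers may need task-specific normalization.
    normalized = [s.strip().lower() for s in samples]
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)


answer, agreement = agreement_confidence(["Paris", "paris ", "Lyon", "Paris", "Paris"])
print(answer, agreement)  # paris 0.8
```

Exact-match voting like this fits short, categorical answers. Free-form outputs would likely need a semantic comparison or a verifier instead.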

What Confidence Is For

Confidence should usually drive routing of results. Use it to decide whether the system should accept an answer, abstain, retry, escalate to a human, or gather more context.
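
One way to picture this routing is the Python sketch below. Everything in it is an assumption made for illustration: the Signals fields, the route function, the action names, and the 0.8 and 0.5 thresholds are hypothetical and would need tuning against your own labeled data.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    agreement: float       # share of parallel samples matching the majority answer (0..1)
    evidence_found: bool   # did the model find the facts required to support the answer?
    verifier_passed: bool  # did a separate judge, rule, or retrieval check confirm it?


def route(signals, retries_left, accept_at=0.8, retry_at=0.5):
    """Map confidence signals to the next action (hypothetical thresholds)."""
    if not signals.evidence_found:
        # Without supporting facts, fetch more context while budget remains;
        # otherwise abstain rather than guess.
        return "gather_context" if retries_left > 0 else "abstain"
    if signals.verifier_passed and signals.agreement >= accept_at:
        return "accept"
    if signals.agreement >= retry_at and retries_left > 0:
        # Moderate agreement: another attempt may settle the answer.
        return "retry"
    # Low agreement, or no retry budget left: hand off to a human.
    return "escalate"


print(route(Signals(agreement=0.9, evidence_found=True, verifier_passed=True), retries_left=1))   # accept
print(route(Signals(agreement=0.6, evidence_found=True, verifier_passed=False), retries_left=0))  # escalate
```

Each branch returns a different next step, so the signals change system behavior rather than just decorating the output.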

For analytical AI, a confidence score is most valuable when it changes what the system does next.