Which Cases to Annotate | Sutro Handbook

How many annotations are needed?

This is highly task-dependent, but the answer is often fewer than you might think. We've seen as few as 10 annotations meaningfully improve task accuracy. Generally speaking, 30-50 annotations are enough to give a model a representative sample from which to "learn" to perform the task like an expert, especially when paired with an automated prompt optimization framework.

How do I find good cases for annotation?

This is an age-old question in machine learning, and the answer significantly affects both the total time spent on annotation and accuracy/task learnability. We said we wouldn't advertise much in this handbook, but this is one of the core value offerings of Sutro: we've automated edge-case discovery so you can minimize time spent on annotation while maximizing improvements in accuracy.
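To make "edge-case discovery" a little more concrete, here is a minimal sketch of one generic heuristic, margin-based uncertainty sampling. It is not a description of how Sutro does it. It assumes you already have per-class probabilities for each unlabeled case. The function name `pick_cases_for_annotation`, the case IDs, and the probabilities are all hypothetical and purely illustrative.

```python
from typing import Dict, List, Tuple


def pick_cases_for_annotation(
    predictions: Dict[str, List[float]], budget: int = 30
) -> List[Tuple[str, float]]:
    """Return up to `budget` cases, most ambiguous first.

    Ambiguity is measured as the margin between the two highest class
    probabilities: a small margin means the model could not clearly
    choose between two labels, so an expert annotation is likely to be
    informative. This is a generic heuristic, not Sutro's method.
    """
    scored = []
    for case_id, probs in predictions.items():
        top_two = sorted(probs, reverse=True)[:2]
        # With a single class there is no runner-up; use the top score as-is.
        margin = top_two[0] - top_two[1] if len(top_two) == 2 else top_two[0]
        scored.append((case_id, margin))
    scored.sort(key=lambda item: item[1])  # smallest margin first
    return scored[:budget]


# Hypothetical model outputs for four unlabeled cases (three classes each).
predictions = {
    "case-a": [0.98, 0.01, 0.01],
    "case-b": [0.51, 0.47, 0.02],
    "case-c": [0.40, 0.35, 0.25],
    "case-d": [0.90, 0.07, 0.03],
}

selected = pick_cases_for_annotation(predictions, budget=2)
for case_id, margin in selected:
    print(f"{case_id}: margin={margin:.2f}")
```

In practice you would set `budget` somewhere in the 30-50 range discussed above. A heuristic like this only surfaces cases the model is unsure about, so it can miss confidently wrong predictions.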