Choose a not-bad model
This is one of the shorter sections of this handbook. At Sutro, we believe model selection is one of the least important decisions an AI engineer makes when building a reliable AI system for the vast majority of analytical tasks, and we expect the decision to get even easier over time.
Foundation Models as Operating Systems
Without logging into AWS, can you name the exact Linux distribution running on the most recent VM you launched? Probably not, but you can probably name the operating system your laptop runs (macOS, Windows, or Linux).
That's because operating systems became a commodity as they improved: for developers, their core functionality is largely identical. For instance, running a web server or database on a virtual machine today is likely more a question of application needs and vertical or horizontal scaling than of whether the application can run at all.
Similarly, we believe that as instruction following and data understanding improve in foundation models, they will be able to handle virtually any "learnable" unstructured data processing or mapping task. The emphasis will shift from the choice of base model to application latency, throughput, scaling, and cost.
Pages in This Section
- Open-Source vs. Closed Models: why analytical AI systems should usually move toward open-source model infrastructure over time.
- Performance Tradeoffs: how to balance model intelligence, latency, throughput, cost, and task reliability.
- Routers and Ensembles: when escalation, voting, or multiple-model approaches are worth the added system complexity.
The practical default is simple: choose one strong enough model, measure it against real task behavior, and only add model-level complexity when the eval loop proves that a simpler setup is insufficient.
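As a rough illustration of that default, the sketch below scores candidate setups against a small labeled sample and returns the simplest one that clears a reliability threshold. Everything in it is hypothetical and for illustration only: the names `accuracy`, `choose_setup`, `single_model`, and `model_with_escalation`, the toy keyword "models" standing in for real model calls, the four labeled examples, and the thresholds. It is not Sutro's API or a prescribed eval method.

```python
from typing import Callable, Optional, Sequence, Tuple

# Assumption: a "model" is any callable that maps an input string to a label.
Model = Callable[[str], str]
Example = Tuple[str, str]  # (input text, expected label)


def accuracy(model: Model, examples: Sequence[Example]) -> float:
    """Fraction of examples where the model output equals the expected label."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for text, expected in examples if model(text) == expected)
    return correct / len(examples)


def choose_setup(
    candidates: Sequence[Tuple[str, Model]],
    examples: Sequence[Example],
    threshold: float,
) -> Optional[str]:
    """Return the first candidate (ordered simplest first) that meets the threshold."""
    for name, model in candidates:
        if accuracy(model, examples) >= threshold:
            return name
    return None  # nothing is good enough; revisit the task or the data


# Toy stand-ins for real model calls (hypothetical, keyword based).
def single_model(text: str) -> str:
    return "positive" if ("great" in text or "love" in text) else "negative"


def model_with_escalation(text: str) -> str:
    # Adds one extra rule on top of the simple model, at the cost of complexity.
    if "not " in text:
        return "negative"
    return single_model(text)


# Hypothetical labeled sample drawn from "real task behavior".
EXAMPLES = [
    ("great product", "positive"),
    ("terrible support", "negative"),
    ("love it", "positive"),
    ("not great", "negative"),
]

# Ordered from simplest to most complex.
CANDIDATES = [
    ("single_model", single_model),
    ("model_with_escalation", model_with_escalation),
]
```

With these toy inputs, `choose_setup(CANDIDATES, EXAMPLES, 0.7)` picks `single_model`, while raising the threshold to `0.9` selects the more complex setup. The ordering of `CANDIDATES` encodes the preference for simplicity: added complexity is only chosen once the measured results show the simpler option falls short.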