From Idea to Millions of Requests, Simplified
Sutro takes the pain out of testing and scaling LLM batch jobs, unblocking your most ambitious AI projects.
import sutro as so
from pydantic import BaseModel

class ReviewClassifier(BaseModel):
    sentiment: str
user_reviews = 'User_reviews.csv'
system_prompt = 'Classify the review as positive, neutral, or negative.'
results = so.infer(user_reviews, system_prompt, output_schema=ReviewClassifier)
Progress: 1% | 1/514,879 | Input tokens processed: 0.41m, Tokens generated: 591k
█░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
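The `output_schema` argument pins each model response to the Pydantic schema, so every row comes back as structured data instead of free text. A minimal local sketch of that validation step (no Sutro call involved; the raw response string is a hypothetical example):

```python
import json
from pydantic import BaseModel

class ReviewClassifier(BaseModel):
    sentiment: str

# Hypothetical raw model response; with output_schema set, every
# response in the batch is constrained to this shape.
raw_response = '{"sentiment": "positive"}'

# Validate the response against the schema.
parsed = ReviewClassifier(**json.loads(raw_response))
print(parsed.sentiment)  # positive
```

Because every row validates against the same schema, downstream code can rely on the field names and types without defensive parsing.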
Rapidly Prototype
Shorten development cycles by getting feedback from large batch jobs in minutes before scaling up.
Reduce Costs
Get results faster and cut costs by 10x or more by parallelizing your LLM calls through Sutro.
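The win from parallelization comes from fanning requests out instead of looping one at a time. A toy sketch of the idea with the standard library (the `classify` function is a stand-in for a real LLM call, not Sutro's API):

```python
from concurrent.futures import ThreadPoolExecutor

def classify(review: str) -> str:
    # Stand-in for a real LLM call; a batch platform fans these
    # out across many workers instead of looping sequentially.
    return "positive" if "great" in review.lower() else "neutral"

reviews = ["Great product!", "It arrived on time.", "Great support."]

# Fan the calls out across a worker pool; results come back in order.
with ThreadPoolExecutor(max_workers=8) as pool:
    labels = list(pool.map(classify, reviews))

print(labels)  # ['positive', 'neutral', 'positive']
```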
Scale Effortlessly
Confidently handle millions of requests and billions of tokens at a time, without the pain of managing infrastructure.
Iterate
Start small and iterate fast on your LLM batch workflows. Accelerate experiments by testing on Sutro before committing to large jobs.
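Starting small can be as simple as sampling a slice of your data before committing to the full run. A sketch with pandas (the toy DataFrame stands in for a real reviews file):

```python
import pandas as pd

# Toy stand-in for a large reviews dataset.
reviews = pd.DataFrame({"review": ["Great!", "Meh.", "Awful.", "Fine."]})

# Draw a small, reproducible pilot sample before the full batch job.
pilot = reviews.sample(frac=0.5, random_state=0)
print(len(pilot))  # 2
```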
Scale your LLM workflows so your team can do more in less time. Process billions of tokens in hours, not days, with no infrastructure headaches or exploding costs.
Data Orchestrators
Object Storage and Open Data Formats
Notebooks and Pythonic Coding Tools



