A Simplified Workflow for Document Processing
Sutro takes the pain out of testing and scaling LLM batch jobs, unblocking your most ambitious document digitization projects.
```python
import sutro as so
from pydantic import BaseModel

class ReviewClassifier(BaseModel):
    sentiment: str

user_reviews = 'user_reviews.csv'
system_prompt = 'Classify the review as positive, neutral, or negative.'
results = so.infer(user_reviews, system_prompt, output_schema=ReviewClassifier)
# Progress: 1% | 1/514,879 | Input tokens processed: 0.41m, Tokens generated: 591k
```
Prototype
Start small and iterate fast on your document extraction workflows. Accelerate experiments by testing on a small batch of documents with Sutro before committing to large jobs.
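A minimal sketch of that prototype-then-scale loop: carve off a small batch of rows with the standard library, validate your prompt and schema on it, then point the same call at the full dataset. The `sample_rows` helper is hypothetical (not part of the Sutro SDK); `so.infer` is the call from the snippet above.

```python
import csv

def sample_rows(path, n=100):
    """Read the first n data rows from a CSV of reviews (stdlib only)."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        return [row for _, row in zip(range(n), reader)]

# Hypothetical usage: test the prompt and output schema on a small batch,
# then rerun the identical so.infer call on the full file once results look right.
# small_batch = sample_rows("user_reviews.csv", n=100)
# results = so.infer(small_batch, system_prompt, output_schema=ReviewClassifier)
```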
Scale
Scale your document processing workflows so your team can do more in less time. Process billions of tokens from your documents in hours, with no infrastructure headaches or exploding costs.
Integrate
Seamlessly connect Sutro to your existing data workflows. Sutro's Python SDK is compatible with popular data orchestration tools, like Airflow and Dagster, and works with data from any object storage.
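One way that integration can look, sketched under assumptions: the batch job lives in a plain Python function that an orchestrator wraps as a step (Airflow's `@task` or a `PythonOperator`, Dagster's `@op`). The inference callable is injected as `infer_fn` (in production this would be `so.infer` from the snippet above), which keeps the step framework-agnostic and easy to test offline.

```python
# In an Airflow DAG this function would be wrapped with @task (or passed to a
# PythonOperator); in Dagster, decorated with @op. Keeping it a plain function
# makes the same step reusable across orchestrators.
def classify_reviews_task(input_path, system_prompt, output_schema, infer_fn):
    """One orchestrated step: run the batch job and return its results so a
    downstream task can write them to object storage."""
    return infer_fn(input_path, system_prompt, output_schema=output_schema)
```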

Scale Effortlessly
Confidently process millions of documents and billions of tokens at a time. Go from a single page to your entire archive without the pain of managing infrastructure.
Get structured data faster and significantly reduce costs. Sutro parallelizes LLM calls to run batch jobs at a fraction of the cost of other methods.

Go from Documents to Data in Hours, Not Days
Shorten development cycles by getting feedback from large document batches in minutes. Run massive OCR and extraction jobs and get complete results in hours.