Model routing
Sadie’s LLM calls never hardcode a vendor model. Every call declares a task class; the router resolves that class to a tier; the tier resolves to a provider + model at runtime. Swap vendors by changing env vars; no application code moves.
Implementation: packages/ai/src/model-router.ts and packages/ai/src/index.ts.
The four tiers

| Tier | Intent | Canonical Anthropic | Canonical OpenAI |
|---|---|---|---|
| tier0 | Deterministic code, no model needed | — | — |
| tier1 | Frontier-small, high-volume maintenance | claude-haiku-4-5-20251001 | gpt-4o-mini |
| tier2 | Workhorse synthesis, interactive chat | claude-sonnet-4-6 | gpt-4o |
| tier3 | Deliberate reasoning, expensive | claude-opus-4-7 | o1 |
| embeddings | Vector embeddings | — | text-embedding-3-small |
These are the defaults in DEFAULT_MODELS when a raw ANTHROPIC_API_KEY or OPENAI_API_KEY is set without tier-specific overrides. Anthropic is preferred when both keys are present.
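A minimal sketch of how the defaults table and the vendor preference could look in TypeScript. The type names and the pickVendor helper are illustrative, not the real exports; only the model IDs and the "Anthropic wins when both keys exist" rule come from this page.

```typescript
// Illustrative shape for a per-vendor defaults table keyed by tier.
type Tier = "tier1" | "tier2" | "tier3" | "embeddings";
type Vendor = "anthropic" | "openai";

const DEFAULT_MODELS: Record<Vendor, Partial<Record<Tier, string>>> = {
  anthropic: {
    tier1: "claude-haiku-4-5-20251001",
    tier2: "claude-sonnet-4-6",
    tier3: "claude-opus-4-7",
  },
  openai: {
    tier1: "gpt-4o-mini",
    tier2: "gpt-4o",
    tier3: "o1",
    embeddings: "text-embedding-3-small",
  },
};

// Anthropic is preferred when both raw keys are present (hypothetical helper).
function pickVendor(hasAnthropicKey: boolean, hasOpenAiKey: boolean): Vendor | null {
  if (hasAnthropicKey) return "anthropic";
  if (hasOpenAiKey) return "openai";
  return null;
}
```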
Task class to tier
Every LLM call in the codebase passes a TaskClass to getProviderForTask(taskClass, userPrefs). The task class determines the tier.
| Task class | Tier | Where it fires |
|---|---|---|
| source_extraction | tier1 | Wiki compile, per-source theme / entity extraction. |
| candidate_page_matching | tier1 | Matching incoming material against existing wiki entries. |
| wiki_patch_draft | tier1 | Drafting a candidate patch. |
| contradiction_detection | tier1 | Lint: does this source disagree with an existing claim? |
| today_card_features | tier1 | Feature extraction for Today ranking. |
| preference_normalization | tier1 | Clustering raw preference signals into canonical kinds. |
| wiki_lint | tier1 | Wiki lint pass findings. |
| wiki_page_create | tier2 | First-time wiki page synthesis. |
| today_card_copy | tier2 | Final copy on a Today card. |
| brief_generation | tier2 | Brief prose. |
| grounded_chat | tier2 | Normal interactive chat. |
| studio_rewrite | tier2 | Studio rewrites. |
| contradiction_resolution | tier3 | Resolving a detected contradiction into a unified synthesis. |
| deep_synthesis_chat | tier3 | Opt-in slow chat mode. |
The full map lives in TASK_TIER_MAP in packages/ai/src/model-router.ts. Task classes are a closed set; adding one requires adding both the type and an entry in the map.
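A sketch of why the closed set stays exhaustive: with TaskClass as a string-literal union and the map typed Record<TaskClass, Tier>, forgetting an entry (or adding a task class without one) is a compile error. The entries mirror the table above; the exact type declarations in packages/ai/src/model-router.ts may differ.

```typescript
// Closed union of task classes; tier0 needs no model, so it has no entries here.
type Tier = "tier1" | "tier2" | "tier3";
type TaskClass =
  | "source_extraction"
  | "candidate_page_matching"
  | "wiki_patch_draft"
  | "contradiction_detection"
  | "today_card_features"
  | "preference_normalization"
  | "wiki_lint"
  | "wiki_page_create"
  | "today_card_copy"
  | "brief_generation"
  | "grounded_chat"
  | "studio_rewrite"
  | "contradiction_resolution"
  | "deep_synthesis_chat";

// Record<TaskClass, Tier> forces one entry per task class, no more, no fewer.
const TASK_TIER_MAP: Record<TaskClass, Tier> = {
  source_extraction: "tier1",
  candidate_page_matching: "tier1",
  wiki_patch_draft: "tier1",
  contradiction_detection: "tier1",
  today_card_features: "tier1",
  preference_normalization: "tier1",
  wiki_lint: "tier1",
  wiki_page_create: "tier2",
  today_card_copy: "tier2",
  brief_generation: "tier2",
  grounded_chat: "tier2",
  studio_rewrite: "tier2",
  contradiction_resolution: "tier3",
  deep_synthesis_chat: "tier3",
};
```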
Resolution order
getProviderForTask walks five priorities:

1. User-supplied API key for their preferred provider (from sadieSettings.payload.userApiKeys, decrypted server-side).
2. User's preferred provider with an env-var key (ANTHROPIC_API_KEY or OPENAI_API_KEY).
3. AI_FRONTIER_* tier env vars. Each tier has its own _PROVIDER + _MODEL pair.
4. Raw vendor key. Synthesize a provider from DEFAULT_MODELS for the resolved tier.
5. Local stub. The sadie-local deterministic provider. Development only (NODE_ENV !== "production"), and suppressible via SADIE_ALLOW_LOCAL_AI_STUB=0.
If none of the above resolve, getLlmProvider throws MissingLlmProviderError. The chat route catches this and surfaces the error to the user as a config-hint message rather than silently falling back.
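The five-priority walk can be sketched as a single function. Everything here is illustrative: the real getProviderForTask resolves full provider objects, not strings, and the option names are assumptions; only the priority order, the env var names, and the throw-on-exhaustion behavior come from this page.

```typescript
// Hypothetical shape of the resolution walk; real code is in
// packages/ai/src/model-router.ts.
interface Resolved { source: string; provider: string }

function resolveProvider(opts: {
  userKeyProvider?: string;   // vendor of a decrypted user-supplied key
  preferredProvider?: string; // user's preferred vendor
  tierEnvProvider?: string;   // AI_FRONTIER_*_PROVIDER pin for this tier
  env: Record<string, string | undefined>;
  isProduction: boolean;
}): Resolved {
  const { env } = opts;
  // 1. User-supplied API key for their preferred provider.
  if (opts.userKeyProvider) return { source: "user-key", provider: opts.userKeyProvider };
  // 2. Preferred provider backed by an env-var key.
  if (opts.preferredProvider === "anthropic" && env.ANTHROPIC_API_KEY)
    return { source: "preferred-env", provider: "anthropic" };
  if (opts.preferredProvider === "openai" && env.OPENAI_API_KEY)
    return { source: "preferred-env", provider: "openai" };
  // 3. Tier-specific AI_FRONTIER_* pin (key checks elided in this sketch).
  if (opts.tierEnvProvider) return { source: "tier-env", provider: opts.tierEnvProvider };
  // 4. Raw vendor key; Anthropic is preferred when both are present.
  if (env.ANTHROPIC_API_KEY) return { source: "raw-key", provider: "anthropic" };
  if (env.OPENAI_API_KEY) return { source: "raw-key", provider: "openai" };
  // 5. Local stub: development only, suppressible via SADIE_ALLOW_LOCAL_AI_STUB=0.
  if (!opts.isProduction && env.SADIE_ALLOW_LOCAL_AI_STUB !== "0")
    return { source: "stub", provider: "sadie-local" };
  throw new Error("MissingLlmProviderError");
}
```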
Tier env vars
All optional. Setting them pins a specific vendor + model per tier.
```
AI_FRONTIER_SMALL_PROVIDER=anthropic
AI_FRONTIER_SMALL_MODEL=claude-haiku-4-5-20251001
AI_FRONTIER_WORKHORSE_PROVIDER=anthropic
AI_FRONTIER_WORKHORSE_MODEL=claude-sonnet-4-6
AI_FRONTIER_REASONING_PROVIDER=anthropic
AI_FRONTIER_REASONING_MODEL=claude-opus-4-7
AI_FRONTIER_EMBEDDINGS_PROVIDER=openai
AI_FRONTIER_EMBEDDINGS_MODEL=text-embedding-3-small
```

When a tier env var is set but the matching vendor key is not, the router cascades to the next cheaper tier rather than failing. That keeps things running when a tier is partially misconfigured.
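The cascade-to-cheaper-tier behavior can be sketched as a small loop. The function name and argument shapes are hypothetical; the rule it encodes (a pinned tier whose vendor key is missing falls through tier3 → tier2 → tier1 instead of failing) is the one described above.

```typescript
type Tier = "tier1" | "tier2" | "tier3";

// Each tier's next cheaper fallback; tier1 has nowhere left to go.
const CHEAPER: Record<Tier, Tier | null> = { tier3: "tier2", tier2: "tier1", tier1: null };

// Hypothetical helper: pinnedVendor maps a tier to its AI_FRONTIER_* vendor
// pin (if any); keys is the set of vendors with a usable API key.
function effectiveTier(
  tier: Tier,
  pinnedVendor: Partial<Record<Tier, string>>,
  keys: Set<string>,
): Tier | null {
  let t: Tier | null = tier;
  while (t) {
    const vendor = pinnedVendor[t];
    // A tier is usable if it is unpinned or its pinned vendor has a key.
    if (!vendor || keys.has(vendor)) return t;
    t = CHEAPER[t]; // otherwise cascade to the next cheaper tier
  }
  return null;
}
```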
Why tiers, not direct model picks
The product surface changes more slowly than the frontier. Sadie's cost profile is dominated by tier1 (high-volume, low-stakes) and tier2 (interactive). Tier3 is rare and deliberate. Locking a specific model at the call site would mean every vendor swap touches dozens of files; tiers mean one env var update.
The router also enables per-task-class escalation in the future. A low-confidence tier2 output could automatically retry at tier3, though that path is not wired yet.
Logging
Every routed call passes through withUsageLogging → logUsage in packages/ai/src/usage-logger.ts. That produces a UsageRecord with tier, provider, model, latency, token counts, and origin. The current implementation is console.log, but the type is stable for persisting into a table when needed.
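A sketch of what the wrapper and record could look like, assuming the field names listed above; the real types live in packages/ai/src/usage-logger.ts and may differ in naming and signature.

```typescript
// Illustrative UsageRecord shape, built from the fields named in the docs.
interface UsageRecord {
  tier: string;
  provider: string;
  model: string;
  latencyMs: number;
  inputTokens: number;
  outputTokens: number;
  origin: string;
}

// Hypothetical wrapper: time the call, then emit a record even on failure.
async function withUsageLogging<T>(
  meta: Omit<UsageRecord, "latencyMs">,
  fn: () => Promise<T>,
  log: (r: UsageRecord) => void = (r) => console.log(JSON.stringify(r)),
): Promise<T> {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    log({ ...meta, latencyMs: Date.now() - start });
  }
}
```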
Convenience: routeAndComplete
A one-call convenience that resolves the tier, creates the provider, calls completeOnce, and logs usage:
```typescript
const { text, resolved } = await routeAndComplete(
  { taskClass: "source_extraction", userId, origin: "/api/compile/wiki" },
  { system: "...", messages: [{ role: "user", content: "..." }] },
);
```

Use it for background compile work. For streaming chat, call getProviderForTask("grounded_chat", userPrefs).streamChat(...) directly so you can pipe tokens.
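The streaming path can be sketched as follows. The provider interface here (streamChat returning an async iterable of text chunks) is an assumption for illustration; the real signature lives in packages/ai/src/index.ts.

```typescript
// Assumed provider interface: streamChat yields text chunks.
interface StreamingProvider {
  streamChat(req: {
    system: string;
    messages: { role: string; content: string }[];
  }): AsyncIterable<string>;
}

// Forward each chunk to the response stream as it arrives.
async function pipeToClient(provider: StreamingProvider, write: (chunk: string) => void) {
  for await (const chunk of provider.streamChat({
    system: "...",
    messages: [{ role: "user", content: "..." }],
  })) {
    write(chunk);
  }
}

// Deterministic stub provider, handy for exercising the pipe in tests.
const stub: StreamingProvider = {
  async *streamChat() {
    yield "hel";
    yield "lo";
  },
};
```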