AI Providers
Multi-provider AI model support via AI SDK v6, including model catalog, cost tracking, provider configuration, and speed tiers.
The Sprinter Platform supports multiple AI providers through the Vercel AI SDK v6. Agents can use models from Anthropic, OpenAI, Google, xAI, and DeepSeek, with per-model cost tracking and speed tier classification.
AI SDK v6
All AI interactions use the AI SDK v6 API surface:
- `streamText()` -- server-side streaming text generation with tool use
- `convertToModelMessages()` -- converts stored message format to SDK-compatible messages
- `DefaultChatTransport` -- client-side transport for streaming chat responses
- `generateObject()` -- structured output generation (used by quick capture)
The platform never uses older AI SDK APIs. The messages table stores AI SDK v6 parts (JSONB) for rich content representation including text, tool calls, and tool results.
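Parts-based storage can be illustrated with a minimal sketch. The exact part types are defined by the AI SDK; the shapes below are a simplified assumption for illustration only.

```typescript
// Simplified sketch of AI SDK v6-style message parts as stored in the
// messages table (JSONB). Real part shapes are defined by the SDK; the
// fields here are an illustrative assumption.
type MessagePart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string; input: unknown }
  | { type: "tool-result"; toolName: string; output: unknown };

interface StoredMessage {
  role: "user" | "assistant";
  parts: MessagePart[];
}

// One assistant turn that called a tool and then answered with text.
const message: StoredMessage = {
  role: "assistant",
  parts: [
    { type: "tool-call", toolName: "webSearch", input: { query: "AI SDK v6" } },
    { type: "tool-result", toolName: "webSearch", output: { hits: 3 } },
    { type: "text", text: "Here is what I found." },
  ],
};

// Extract only the text content from a message's parts.
function textOf(msg: StoredMessage): string {
  return msg.parts
    .filter((p): p is Extract<MessagePart, { type: "text" }> => p.type === "text")
    .map((p) => p.text)
    .join("");
}

console.log(textOf(message)); // "Here is what I found."
```

Storing parts rather than flat strings is what lets the chat UI replay tool calls and results alongside the assistant's text.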
Model catalog
The runtime model catalog is the DB-backed ai_models table. features/ai-models/catalog.ts contains the fallback seed/catalog used when the DB is unavailable and as source data for migrations.
Supported providers
| Provider | Default model | Env variable |
|---|---|---|
| Anthropic | claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| OpenAI | gpt-4.1 | OPENAI_API_KEY |
| Google | gemini-2.5-flash | GOOGLE_GENERATIVE_AI_API_KEY |
| xAI | grok-3 | XAI_API_KEY |
| DeepSeek | deepseek-chat | DEEPSEEK_API_KEY |
The default chat model for the platform is claude-sonnet-4-6. The default image generation model is gpt-image-1.5.
Model entries
Each model in the catalog has:
```ts
interface ModelCatalogEntry {
  id: string;                       // Model ID used in API calls (e.g., "claude-sonnet-4-6")
  provider: ModelProvider;          // "anthropic" | "openai" | "google" | "xai" | "deepseek"
  name: string;                     // Display name (e.g., "Claude Sonnet 4.6")
  description?: string;             // Human-readable description
  capabilities: ModelCapability[];  // What the model can do
  speedTier: "fast" | "medium" | "slow";
  reasoningLevel?: "low" | "medium" | "high";
  costPer1mInput: number;           // Cost per 1M input tokens (USD)
  costPer1mOutput: number;          // Cost per 1M output tokens (USD)
  enabled?: boolean;
}
```

Available models
Anthropic:
- Claude Sonnet 4.5 -- fast, $3/$15 per 1M tokens
- Claude Sonnet 4.6 -- fast, $3/$15, extended thinking support
- Claude Opus 4.6 -- slow, $5/$25, most capable for complex tasks
- Claude Haiku 4.5 -- fast, $1/$5, cost-efficient for high-volume work
OpenAI:
- GPT-4.1 -- medium, $2/$8, strong general-purpose
- GPT-4o -- fast, $2.50/$10, multimodal (image + audio)
- o3 -- slow, $10/$40, advanced reasoning
- o4-mini -- fast, $1.10/$4.40, fast reasoning
Google:
- Gemini 2.5 Pro -- medium, $1.25/$10, multimodal reasoning
- Gemini 2.5 Flash -- fast, $0.30/$2.50, fast and affordable
xAI:
- Grok 3 -- medium, $3/$15
- Grok 3 Mini -- fast, $0.30/$1
DeepSeek:
- DeepSeek Chat V3 -- fast, $0.14/$0.28, strong coding at very low cost
- DeepSeek Reasoner R1 -- medium, $0.55/$2.19, deep reasoning
Image generation:
- GPT Image 1.5, GPT Image 1, DALL-E 3 (OpenAI)
- Gemini 2.5 Flash Image (Google)
Speed tiers
Models are classified into three speed tiers to help users and agents choose the right model for their use case:
| Tier | Typical response time | Best for |
|---|---|---|
| fast | Under 5 seconds | Chat, quick extraction, high-volume tasks |
| medium | 5-15 seconds | Balanced analysis, document processing |
| slow | 15+ seconds | Complex reasoning, multi-step analysis |
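A caller that wants to cap latency can filter the catalog by tier. The helper below is a hypothetical sketch, not part of the platform API; it treats the tiers as ordered fast < medium < slow and picks the fastest model within a latency budget.

```typescript
// Hypothetical helper (not in the platform codebase): pick the fastest
// enabled model whose speed tier is at or below a maximum tier.
type SpeedTier = "fast" | "medium" | "slow";

const TIER_RANK: Record<SpeedTier, number> = { fast: 0, medium: 1, slow: 2 };

interface ModelLike {
  id: string;
  speedTier: SpeedTier;
  enabled?: boolean;
}

function pickByMaxTier(models: ModelLike[], maxTier: SpeedTier): ModelLike | undefined {
  return models
    .filter((m) => m.enabled !== false && TIER_RANK[m.speedTier] <= TIER_RANK[maxTier])
    .sort((a, b) => TIER_RANK[a.speedTier] - TIER_RANK[b.speedTier])[0];
}

const candidates: ModelLike[] = [
  { id: "o3", speedTier: "slow" },
  { id: "gpt-4.1", speedTier: "medium" },
  { id: "gemini-2.5-flash", speedTier: "fast" },
];

console.log(pickByMaxTier(candidates, "medium")?.id); // "gemini-2.5-flash"
```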
DB-managed model catalog
The platform stores the model catalog in the ai_models database table, managed via Admin > Models. Whenever the DB registry is available, the enabled rows act as the runtime allowlist for resolving provider language, image, and embedding models. The hardcoded catalog serves only as a fallback when the DB is unavailable, and as a bootstrap for deployments that have not yet seeded rows for a capability.
- `listAiModels()` -- returns all models from the DB, falling back to the hardcoded `FALLBACK_MODEL_CATALOG` if the DB is unavailable
- `listEnabledAiModelEntries()` -- returns only enabled models as `ModelCatalogEntry[]`
- `createAiModel()` / `updateAiModel()` -- CRUD for model records
Each DB model record includes sort_order for display ordering and enabled for toggling availability without deletion. Utility call sites resolve tenant override -> per-call-site env var -> preferred default, then the provider resolver validates/selects the actual enabled registry row and returns the concrete model id used for Langfuse and cost events.
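The resolution order described above can be sketched as a simple preference chain. All names here (resolveModelId, the input shape) are illustrative assumptions, not the platform's actual API; the key idea is that each candidate is validated against the enabled registry rows before it is used.

```typescript
// Sketch of the model-resolution order: tenant override, then the
// call site's env var, then the preferred default. The first candidate
// that is actually enabled in the ai_models registry wins.
interface ResolveInput {
  tenantOverride?: string;   // per-tenant configured model, if any
  envVar?: string;           // value of the call site's env variable, if set
  preferredDefault: string;  // e.g. "claude-sonnet-4-6"
  enabledIds: Set<string>;   // enabled rows from the ai_models registry
}

function resolveModelId({ tenantOverride, envVar, preferredDefault, enabledIds }: ResolveInput): string {
  // Walk the preference chain, skipping unset or disabled candidates.
  for (const candidate of [tenantOverride, envVar, preferredDefault]) {
    if (candidate && enabledIds.has(candidate)) return candidate;
  }
  throw new Error("No enabled model matches the preference chain");
}

const enabled = new Set(["claude-sonnet-4-6", "gpt-4.1"]);
console.log(resolveModelId({ envVar: "gpt-4.1", preferredDefault: "claude-sonnet-4-6", enabledIds: enabled }));
// "gpt-4.1"
```

Returning the concrete registry model id at the end of the chain is what keeps Langfuse traces and cost events consistent with what actually ran.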
Cost tracking
Every AI interaction records token usage. The tool_runs table tracks duration_ms per tool execution. Agent heartbeat runs record tokensUsed in the agent_heartbeat_runs table.
Model costs are defined per entry in the catalog (costPer1mInput, costPer1mOutput) and can be viewed in the Admin > Costs dashboard.
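The per-1M-token rates translate to a per-call dollar amount with straightforward arithmetic; the helper name below is illustrative, not a platform function.

```typescript
// Compute the USD cost of one AI call from the catalog's per-1M-token rates.
interface CostRates {
  costPer1mInput: number;   // USD per 1M input tokens
  costPer1mOutput: number;  // USD per 1M output tokens
}

function usageCostUsd(rates: CostRates, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * rates.costPer1mInput +
    (outputTokens / 1_000_000) * rates.costPer1mOutput
  );
}

// Example: claude-sonnet-4-6 at $3/$15 per 1M tokens,
// for a call with 12,000 input and 2,000 output tokens.
const cost = usageCostUsd({ costPer1mInput: 3, costPer1mOutput: 15 }, 12_000, 2_000);
console.log(cost.toFixed(4)); // "0.0660"
```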
Agent model selection
Each agent can specify its preferred model in the agents table model column. The chat route uses this model when the agent is selected:
- User selects an agent in the chat sidebar
- Chat route resolves the agent (code registry first, then DB fallback)
- Uses the agent's configured model, or falls back to the registry-backed platform default (preferred model: claude-sonnet-4-6)
Users can also select models directly in the chat UI via the model picker, which groups models by provider.
Provider utilities
The catalog exports several helper functions and constants:
- `PROVIDER_ORDER` -- canonical display order: Anthropic, OpenAI, Google, xAI, DeepSeek
- `PROVIDER_LABELS` -- display names for each provider
- `PROVIDER_DEFAULT_MODEL` -- default model per provider
- `groupModelsByProvider(models)` -- groups a model array by provider in display order
These are re-exported from features/chat/model-types.ts for use in chat UI components.
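The actual implementation of `groupModelsByProvider()` lives in features/ai-models/catalog.ts; the sketch below is an assumption about its behavior (group in `PROVIDER_ORDER`, omit providers with no models), not a copy of it.

```typescript
// Hedged sketch of groupModelsByProvider(): walk PROVIDER_ORDER so the
// groups come out in canonical display order, and drop empty providers.
type ModelProvider = "anthropic" | "openai" | "google" | "xai" | "deepseek";

const PROVIDER_ORDER: ModelProvider[] = ["anthropic", "openai", "google", "xai", "deepseek"];

interface ModelRef {
  id: string;
  provider: ModelProvider;
}

function groupModelsByProvider(models: ModelRef[]): Array<{ provider: ModelProvider; models: ModelRef[] }> {
  return PROVIDER_ORDER
    .map((provider) => ({ provider, models: models.filter((m) => m.provider === provider) }))
    .filter((group) => group.models.length > 0);
}

const grouped = groupModelsByProvider([
  { id: "gpt-4.1", provider: "openai" },
  { id: "claude-sonnet-4-6", provider: "anthropic" },
]);
// Anthropic sorts before OpenAI regardless of input order.
console.log(grouped.map((g) => g.provider));
```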
Environment setup
At minimum, set the API key for your primary provider:
```bash
# Required -- at least one provider
ANTHROPIC_API_KEY=sk-ant-...

# Optional -- additional providers
OPENAI_API_KEY=sk-...
GOOGLE_GENERATIVE_AI_API_KEY=AI...
XAI_API_KEY=xai-...
DEEPSEEK_API_KEY=sk-...

# Web search (used by the webSearch tool)
EXA_API_KEY=...
```

Models from providers without configured API keys will still appear in the catalog but will fail at runtime. The platform does not validate API keys at startup.
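Because keys are not validated at startup, a deployment may want to sanity-check its own environment before going live. The helper below is illustrative, not part of the platform; it only checks that each provider's env var is present and non-empty.

```typescript
// Map each provider to the env var the platform reads for it.
const PROVIDER_ENV_VARS: Record<string, string> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_GENERATIVE_AI_API_KEY",
  xai: "XAI_API_KEY",
  deepseek: "DEEPSEEK_API_KEY",
};

// Return the providers whose key is missing or empty in the given env map.
// Pass process.env in a real deployment check.
function providersMissingKeys(env: Record<string, string | undefined>): string[] {
  return Object.entries(PROVIDER_ENV_VARS)
    .filter(([, envVar]) => !env[envVar])
    .map(([provider]) => provider);
}

// With only Anthropic configured, the other four providers are reported.
console.log(providersMissingKeys({ ANTHROPIC_API_KEY: "sk-ant-test" }));
```

Note this only detects absent keys; an invalid key still fails at request time, exactly as described above.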