Analytics and Cost Tracking
Fire-and-forget event recording for analytics, plus per-call AI cost tracking and runtime telemetry for prompt caching, reasoning, and context-management visibility.
The platform tracks two categories of operational data: general analytics events (user actions, feature usage, system events) and AI cost events (token counts, model pricing, per-agent spend). Both systems use a fire-and-forget pattern that never blocks the calling operation.
Overview
Analytics and cost tracking are split across two lightweight modules:
- Analytics (`features/analytics/`) -- `recordAnalyticsEvent()` validates events against a Zod registry, then forwards to PostHog. The `analytics_events` Postgres table is deprecated (migration `20260416100000`); PostHog is the single authoritative sink.
- Cost (`features/cost/`) -- records AI usage costs with model-aware pricing, queries aggregated data, and renders a summary dashboard in the Admin panel.
Both modules use fire-and-forget patterns that never block the caller.
Key Concepts
Analytics
recordAnalyticsEvent<T>(type, payload, tenantId, userId?) -- Validates payload against the Zod schema in features/analytics/events.ts, then calls posthog.capture() via posthog-node. Invalid payloads are dropped with a dev-mode warning; PostHog failures are silent.
The event registry (features/analytics/events.ts) defines 29 v1 events using the resource.verb convention:
- Auth -- `auth.signed_in`, `auth.signed_up`, `auth.signed_out`, `auth.tenant_switched`, `auth.tenant_created`
- Entity -- `entity.created`, `entity.updated`, `entity.deleted`, `entity.viewed`
- Response -- `response.submitted`, `response.promoted`
- Chat -- `chat.started`, `chat.message_sent`, `chat.agent_selected`
- Tool -- `tool.invoked`, `tool.completed`, `tool.failed`, `tool.session_shared`, `tool.embed_opened`, `tool.embed_submitted`
- Task / Session -- `task.created`, `task.session_started`, `task.session_completed`, `task.session_failed`, `task.human_completed`
- Agent -- `agent.dispatched`, `agent.heartbeat_run`
- Runtime telemetry -- `ai.runtime.completed`
- Activation -- `feature.adopted` (8 first-use features; fires once per user/tenant)
Every capture includes groups: { tenant: tenantId } for Group Analytics aggregation.
Client-side identification: features/analytics/identify.ts provides identifyUser, identifyTenantGroup, identifyEmbedSession, resetIdentity, and registerSuperProperties. AnalyticsIdentitySync in the app shell layout wires identity + super-properties on navigation.
Activation tracking: features/analytics/feature-adoption.ts calls trackFeatureAdoption(userId, tenantId, feature). It writes user_feature_adoptions with INSERT ... ON CONFLICT DO NOTHING; only a first-ever insert emits feature.adopted. Wired at 8 sites: entity create, response submit, chat, tool run, task create, agent dispatch, view save, share create.
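The fire-once semantics can be sketched with an in-memory set standing in for the `user_feature_adoptions` table's `INSERT ... ON CONFLICT DO NOTHING`; the function and parameter names below are simplified stand-ins, not the repo's actual code:

```typescript
// In-memory stand-in for the user_feature_adoptions unique constraint.
const adoptions = new Set<string>();

// Only the first-ever (tenant, user, feature) call emits feature.adopted;
// a repeat is the ON CONFLICT DO NOTHING path and emits nothing.
function trackFeatureAdoptionSketch(
  userId: string,
  tenantId: string,
  feature: string,
  emit: (event: string) => void,
): void {
  const key = `${tenantId}:${userId}:${feature}`;
  if (adoptions.has(key)) return; // conflict: already adopted
  adoptions.add(key);
  emit("feature.adopted");
}
```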
Server flush: lib/analytics/posthog-server.ts exports flushPostHog() and shutdownPostHog(). Key API routes wrap completion with after(() => flushPostHog()); Inngest steps await flushPostHog() before returning.
DNT honoring: lib/analytics/posthog-provider.tsx initializes PostHog with opt_out_capturing_by_default set from navigator.doNotTrack === "1".
Adding a new event:
- Add a Zod schema to `EVENT_SCHEMAS` in `features/analytics/events.ts`.
- Add a test case to `features/analytics/events.test.ts`.
- Fire with `void recordAnalyticsEvent("event.name", payload, tenantId, userId)` at the source.
- For E2E assertions, use `attachPostHogCapture(page)` from `e2e/helpers/posthog.ts`.
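The validate-then-capture flow can be sketched as follows. In the real module the validator is the event's Zod schema from `EVENT_SCHEMAS`; here a plain type guard stands in so the sketch stays dependency-free, and the event payload shape is purely illustrative:

```typescript
// Hypothetical payload for an illustrative event; real events live in EVENT_SCHEMAS.
interface ReportExportedPayload {
  reportId: string;
  format: "csv" | "pdf";
}

// Stand-in for schema.safeParse(): true only for a well-formed payload.
function isReportExported(p: unknown): p is ReportExportedPayload {
  if (typeof p !== "object" || p === null) return false;
  const o = p as Record<string, unknown>;
  return typeof o.reportId === "string" && (o.format === "csv" || o.format === "pdf");
}

// Fire-and-forget shape: invalid payloads are dropped (dev-mode warning in the
// real code), and capture failures are swallowed so the caller never blocks.
function recordEventSketch(
  payload: unknown,
  capture: (p: ReportExportedPayload) => void,
): boolean {
  if (!isReportExported(payload)) return false; // dropped
  try {
    capture(payload);
  } catch {
    /* PostHog failures are silent */
  }
  return true;
}
```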
Cost Tracking
CostEvent -- A single AI usage record:
- `provider` -- AI provider name (e.g., "anthropic", "openai")
- `model` -- model identifier
- `input_tokens`, `output_tokens` -- token counts from the API response
- `cost_cents` -- computed cost in cents using model-specific pricing
- `agent_id` -- which agent made the call (nullable)
- `chat_id` -- which chat session (nullable)
- `source` -- context (e.g., "chat", "extraction", "heartbeat")
CostSummary -- Aggregated metrics over a time window:
- `totalCostCents` -- total spend
- `totalInputTokens`, `totalOutputTokens` -- total tokens
- `dailyBurnRate` -- average cost per day
- `eventCount` -- number of API calls
CostTelemetrySummary -- Runtime telemetry aggregated from ai.runtime.completed analytics events:
- `totalCacheReadTokens`, `totalCacheWriteTokens`, `totalReasoningTokens`
- `promptCacheHitCount`, `promptCacheWriteCount`
- `promptCachingConfiguredCount`, `contextManagementConfiguredCount`
- `appliedContextEditCount`
CostByAgent -- Spend grouped by agent ID.
CostByModel -- Spend grouped by model name, enriched with cache-read, cache-write, and reasoning-token totals.
How It Works
Recording Costs
The recordCostEvent() function in features/cost/server/record-cost.ts:
- Looks up the model in the `ai_models` registry via `getModelEntry()` to get per-million-token pricing
- Calculates cost: `(inputTokens / 1M * inputRate) + (outputTokens / 1M * outputRate)`, rounded to 2 decimal places in cents
- Falls back to default rates ($3/1M input, $15/1M output) only when a model is not in the registry, for example external-agent or DB-unavailable fallback paths
- Inserts into the `cost_events` table via the admin client
- Reports insert failures through the non-fatal error path without blocking the user-facing AI response
This function is called after every tracked streamText() / generateText() / generateObject() / embedding call. Image generation currently records visible zero-token rows until image pricing units are added.
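The pricing formula above can be sketched as a pure function; rates are expressed in cents per million tokens (so the $3/$15 defaults are 300 and 1500), and the function name is illustrative:

```typescript
// Sketch of the per-call cost formula: tokens scaled to millions, priced per
// direction, and rounded to 2 decimal places in cents.
function computeCostCents(
  inputTokens: number,
  outputTokens: number,
  inputRateCentsPerMTok: number, // e.g. 300 = $3 per 1M input tokens
  outputRateCentsPerMTok: number, // e.g. 1500 = $15 per 1M output tokens
): number {
  const raw =
    (inputTokens / 1_000_000) * inputRateCentsPerMTok +
    (outputTokens / 1_000_000) * outputRateCentsPerMTok;
  return Math.round(raw * 100) / 100;
}
```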
Recording Runtime Telemetry
The shared agent runtime also emits an analytics event after every completed run:
- Event type: `ai.runtime.completed`
- Metadata: `model`, `source`, `agentId`, `chatId`, `cacheReadTokens`, `cacheWriteTokens`, `reasoningTokens`, `promptCachingConfigured`, `contextManagementConfigured`, `appliedContextEdits`
This event is additive to cost_events: spend and raw token accounting remain in the cost table, while provider-specific runtime signals live in analytics.
Querying Cost Data
The getCostData(tenantId, days?, agentId?) function performs a single-pass aggregation:
- Paginates through `cost_events` for spend and token totals.
- Paginates through `analytics_events` for `ai.runtime.completed` telemetry.
- Aggregates:
  - Total cost, input tokens, output tokens
  - Workspace- or agent-scoped cache-read, cache-write, reasoning, and context-management metrics
  - Per-agent spend
  - Per-model spend plus cache/reasoning totals
- Returns `{ summary, telemetry, byAgent, byModel }`
A thin getCostSummary() wrapper delegates to getCostData() for callers that only need the top-line totals. The previous getCostByAgent() / getCostByModel() wrappers were deleted: each re-ran the full scan (cost events + runtime telemetry + agent-name join) just to return one property. Call getCostData() once and read the relevant breakdown instead.
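The single-pass aggregation can be sketched over a simplified row shape; real rows come from `cost_events` and carry more fields, and the names below are illustrative:

```typescript
// Simplified cost-event row; the real table also carries tokens, source, etc.
interface CostRow {
  agentId: string | null;
  model: string;
  costCents: number;
}

// One pass over the rows produces the total plus both breakdowns, which is why
// a single getCostData() call replaces the deleted per-breakdown wrappers.
function aggregate(rows: CostRow[]) {
  const byAgent = new Map<string, number>();
  const byModel = new Map<string, number>();
  let totalCostCents = 0;
  for (const r of rows) {
    totalCostCents += r.costCents;
    if (r.agentId !== null) {
      byAgent.set(r.agentId, (byAgent.get(r.agentId) ?? 0) + r.costCents);
    }
    byModel.set(r.model, (byModel.get(r.model) ?? 0) + r.costCents);
  }
  return { totalCostCents, byAgent, byModel };
}
```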
Admin Dashboard
The CostSummary component in features/cost/components/cost-summary.tsx renders:
- Five KPI cards -- Total Spend (30d), Daily Burn Rate, Total Tokens, Cache Reuse, Reasoning Tokens
- Runtime Telemetry -- cache writes, prompt-caching configuration count, context-management configuration count, applied context edits, cache-write tokens
- Cost by Agent -- list of agents with call counts and spend
- Cost by Model -- list of models with call counts, spend, cache-read, cache-write, and reasoning totals
The component fetches data from GET /api/costs/summary on mount. Empty state shows a message indicating costs are tracked automatically. Values are formatted with smart suffixes (cents for small amounts, dollars for larger; K/M for tokens).
API Reference
Analytics
| Function | Location | Purpose |
|---|---|---|
recordAnalyticsEvent(type, payload, tenantId, userId?) | features/analytics/record.ts | Fire-and-forget event recording |
Cost Recording
| Function | Location | Purpose |
|---|---|---|
recordCostEvent(input) | features/cost/server/record-cost.ts | Record AI usage with auto-pricing |
Cost Queries
| Function | Location | Purpose |
|---|---|---|
getCostData(tenantId, days?, agentId?) | features/cost/server/queries.ts | Full cost data: summary + telemetry + by-agent + by-model |
getCostSummary(tenantId, days?) | Same | Summary only (wrapper) |
Types
| Type | Location | Purpose |
|---|---|---|
CostEvent | features/cost/types.ts | Single AI usage record |
CostSummary | Same | Aggregated cost metrics |
CostTelemetrySummary | Same | Aggregated prompt-caching and runtime telemetry metrics |
CostByAgent | Same | Per-agent spend |
CostByModel | Same | Per-model spend plus cache/reasoning totals |
Components
| Component | Location | Purpose |
|---|---|---|
CostSummary | features/cost/components/cost-summary.tsx | Admin cost dashboard with KPIs and breakdowns |
AI Limits and Cost Caps
In addition to tracking costs after the fact, the platform enforces configurable limits before AI calls are made. These limits are resolved per-tenant from tenant_settings (key "ai_limits") and fall back to environment variable defaults when no tenant setting exists.
AiLimits
```typescript
interface AiLimits {
  /** Daily cost cap in cents. 0 = no cap. Default: 5000 ($50). */
  dailyCapCents: number;
  /** Chat messages per minute per user. 0 = no limit. Default: 30. */
  chatRateLimit: number;
  /** Max output tokens per model response. 0 = model default. Default: 0. */
  maxTokensPerResponse: number;
}
```

Resolving Limits
getAiLimits(tenantId) in features/cost/server/ai-limits.ts reads from the settings cascade:
- Default tenant setting -- a system-wide baseline in `tenant_settings` for the default tenant
- Active tenant override -- a per-tenant value in `tenant_settings` (higher priority, overwrites defaults field-by-field)
- Environment variable fallback -- `AI_DAILY_CAP_CENTS`, `CHAT_RATE_LIMIT`, `AI_MAX_TOKENS` when no setting row exists
The result is cross-request cached via unstable_cache with tag-based invalidation, so tenant setting changes take effect immediately on the next request after saveTenantSetting() is called.
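The field-by-field cascade can be sketched as an object-spread merge, where later layers win per field; the function and parameter names are illustrative:

```typescript
interface AiLimits {
  dailyCapCents: number;
  chatRateLimit: number;
  maxTokensPerResponse: number;
}

// Env fallback < default tenant setting < active tenant override.
// Missing layers (null) and missing fields fall through to the layer below.
function resolveLimits(
  envDefaults: AiLimits,
  defaultTenantSetting: Partial<AiLimits> | null,
  tenantOverride: Partial<AiLimits> | null,
): AiLimits {
  return { ...envDefaults, ...defaultTenantSetting, ...tenantOverride };
}
```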
Daily Cost Cap
checkTenantCostCap(tenantId, capCents) in features/cost/server/limits.ts enforces the daily spend limit:
- If `capCents <= 0`, returns `{ allowed: true }` immediately (no cap configured)
- Queries `cost_events` for today's total spend (UTC midnight boundary)
- Returns `{ allowed, currentCents, capCents }`
- Results are cached in-process for 5 minutes per tenant to avoid a DB query on every chat message
The check runs inside executeAgent() before any streamText() call. When the cap is exceeded, executeAgent() throws an error with a user-readable message including the current and cap amounts in dollars.
Fail-open policy: If the DB query errors (transient failure, connection issue), checkTenantCostCap returns { allowed: true } rather than blocking AI. This prevents a DB outage from silently disabling all AI features.
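The check, cache, and fail-open policy can be sketched together; `queryTodaysSpend` is a stand-in for the real `cost_events` query, and the shapes are simplified:

```typescript
type CapResult = { allowed: boolean; currentCents?: number; capCents?: number };

// Per-tenant in-process cache of today's spend, valid for 5 minutes.
const spendCache = new Map<string, { cents: number; expires: number }>();

async function checkCap(
  tenantId: string,
  capCents: number,
  queryTodaysSpend: (tenantId: string) => Promise<number>,
  now = Date.now(),
): Promise<CapResult> {
  if (capCents <= 0) return { allowed: true }; // no cap configured

  const hit = spendCache.get(tenantId);
  let currentCents: number;
  if (hit !== undefined && hit.expires > now) {
    currentCents = hit.cents; // cached: no DB query on this message
  } else {
    try {
      currentCents = await queryTodaysSpend(tenantId);
    } catch {
      return { allowed: true }; // fail open: a DB outage must not disable AI
    }
    spendCache.set(tenantId, { cents: currentCents, expires: now + 5 * 60_000 });
  }
  return { allowed: currentCents < capCents, currentCents, capCents };
}
```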
Chat Rate Limiting
The chat route (POST /api/chat) checks chatRateLimit before processing each message:
- The limit is per user, per minute (sliding window keyed as `chat:{userId}`)
- When exceeded, the route returns HTTP 429 with a `Retry-After`-style message: `"Rate limit exceeded. Try again in Xs."`
- A value of 0 disables rate limiting entirely
The rate limiter is in-process only — it resets on server restart and is not shared across replicas. For production deployments with multiple server instances, use a Redis-backed rate limiter.
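An in-process sliding-window limiter of this shape can be sketched in a few lines; the function name is illustrative and the real route's implementation may differ in detail:

```typescript
// Per-key request timestamps; in-process only, so state is lost on restart
// and not shared across replicas (the limitation noted above).
const windows = new Map<string, number[]>();

function allowRequest(key: string, limit: number, now = Date.now()): boolean {
  if (limit <= 0) return true; // 0 disables rate limiting
  const cutoff = now - 60_000; // one-minute sliding window
  const recent = (windows.get(key) ?? []).filter((t) => t > cutoff);
  if (recent.length >= limit) {
    windows.set(key, recent);
    return false; // caller returns HTTP 429
  }
  recent.push(now);
  windows.set(key, recent);
  return true;
}
```

Usage follows the key convention above, e.g. `allowRequest(`chat:${userId}`, limits.chatRateLimit)`.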
Token Cap
When maxTokensPerResponse > 0, executeAgent() passes maxTokens to streamText(). This limits output token spend per response and prevents runaway generation. A value of 0 defers to the model's own default.
Configuration
Set limits via tenant_settings (key "ai_limits") or environment variables:
| Env var | Default | Purpose |
|---|---|---|
AI_DAILY_CAP_CENTS | 5000 ($50) | Daily total spend cap |
CHAT_RATE_LIMIT | 30 | Chat messages per user per minute |
AI_MAX_TOKENS | 0 (no limit) | Max output tokens per response |
To override for a specific tenant, insert a row into tenant_settings:
```sql
INSERT INTO tenant_settings (tenant_id, key, value)
VALUES (
  'your-tenant-id',
  'ai_limits',
  '{"dailyCapCents": 10000, "chatRateLimit": 60, "maxTokensPerResponse": 4096}'
);
```

API Reference -- Limits
| Function | Location | Purpose |
|---|---|---|
getAiLimits(tenantId) | features/cost/server/ai-limits.ts | Resolve per-tenant AI limits with cascade + env fallback |
checkTenantCostCap(tenantId, capCents) | features/cost/server/limits.ts | Check today's spend against cap; cached 5 min |
__clearCostCapCache() | Same | Clear in-process cache (test use only) |
For Agents
Agents generate cost events automatically through their AI API calls. Every streamText() call in chat, extraction, and heartbeat records a cost event attributed to the agent's ID.
Agents can query their own cost data through the getUsageStats context tool, which now returns token totals, cost totals, prompt-caching signals, reasoning usage, and top-model breakdowns. This supports self-awareness about both spend and context-efficiency.
There are no agent tools for analytics event recording -- that happens automatically at the platform level.
Related Modules
- Agent System (`features/agents/`) -- agent IDs are recorded on cost events; `executeAgent()` enforces cost cap and token limits
- Chat (`features/chat/`) -- records cost events after each AI response; `POST /api/chat` enforces chat rate limit
- Extraction (`features/entities/extraction/`) -- records cost events per field extraction
- Inngest (`features/inngest/`) -- heartbeat and background jobs record costs and respect daily cap
- Settings (`features/settings/`) -- stores the `"ai_limits"` key in `tenant_settings`; provides the settings cascade consumed by `getAiLimits()`
- Admin -- cost dashboard is rendered in the Admin > Costs tab