Fire-and-forget event recording for analytics, plus per-call AI cost tracking and runtime telemetry for prompt caching, reasoning, and context-management visibility.

Analytics and Cost Tracking

The platform tracks two categories of operational data: general analytics events (user actions, feature usage, system events) and AI cost events (token counts, model pricing, per-agent spend). Both systems use a fire-and-forget pattern that never blocks the calling operation.

Overview

Analytics and cost tracking are split across two lightweight modules:

Analytics (features/analytics/) -- recordAnalyticsEvent() validates events against a Zod registry, then forwards to PostHog. The analytics_events Postgres table is deprecated (migration 20260416100000); PostHog is the single authoritative sink.
Cost (features/cost/) -- records AI usage costs with model-aware pricing, queries aggregated data, and renders a summary dashboard in the Admin panel.

Both modules use fire-and-forget patterns that never block the caller.

Key Concepts

Analytics

recordAnalyticsEvent<T>(type, payload, tenantId, userId?) -- Validates payload against the Zod schema in features/analytics/events.ts, then calls posthog.capture() via posthog-node. Invalid payloads are dropped with a dev-mode warning; PostHog failures are silent.

The event registry (features/analytics/events.ts) defines 29 v1 events using the resource.verb convention:

Auth -- auth.signed_in, auth.signed_up, auth.signed_out, auth.tenant_switched, auth.tenant_created
Entity -- entity.created, entity.updated, entity.deleted, entity.viewed
Response -- response.submitted, response.promoted
Chat -- chat.started, chat.message_sent, chat.agent_selected
Tool -- tool.invoked, tool.completed, tool.failed, tool.session_shared, tool.embed_opened, tool.embed_submitted
Task / Session -- task.created, task.session_started, task.session_completed, task.session_failed, task.human_completed
Agent -- agent.dispatched, agent.heartbeat_run
Runtime telemetry -- ai.runtime.completed
Activation -- feature.adopted (8 first-use features; fires once per user/tenant)

Every capture includes groups: { tenant: tenantId } for Group Analytics aggregation.
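
The validate-then-capture flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the real registry uses Zod schemas and posthog-node, whereas here a hand-written `Validator` type and a `CaptureClient` interface stand in for both so the example has no dependencies. The event payload shapes, the `recordEventSketch` name, and the fallback `distinctId` used when `userId` is absent are all hypothetical.

```typescript
// Hypothetical stand-ins: the real module uses Zod schemas and posthog-node.
type Validator = (payload: unknown) => Record<string, unknown> | null;

interface CaptureClient {
  capture(event: {
    distinctId: string;
    event: string;
    properties: Record<string, unknown>;
    groups: Record<string, string>;
  }): void;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// A two-event registry in the resource.verb convention (payload shapes are illustrative).
const EVENT_VALIDATORS = new Map<string, Validator>([
  ["entity.created", (p) => (isRecord(p) && typeof p["entityId"] === "string" ? p : null)],
  ["chat.started", (p) => (isRecord(p) && typeof p["chatId"] === "string" ? p : null)],
]);

// Returns whether the payload passed validation; it never throws to the caller.
function recordEventSketch(
  client: CaptureClient,
  type: string,
  payload: unknown,
  tenantId: string,
  userId?: string,
): boolean {
  const validate = EVENT_VALIDATORS.get(type);
  const parsed = validate ? validate(payload) : null;
  if (parsed === null) {
    return false; // unknown event or invalid payload: dropped
  }
  try {
    client.capture({
      distinctId: userId ?? `tenant:${tenantId}`, // assumption: fallback id
      event: type,
      properties: parsed,
      groups: { tenant: tenantId }, // enables Group Analytics aggregation
    });
  } catch {
    // Sink failures are swallowed so analytics never breaks the caller.
  }
  return true;
}
```

Returning a boolean is a convenience for testing the sketch; a caller following the fire-and-forget pattern would ignore the result.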

Client-side identification: features/analytics/identify.ts provides identifyUser, identifyTenantGroup, identifyEmbedSession, resetIdentity, and registerSuperProperties. AnalyticsIdentitySync in the app shell layout wires identity + super-properties on navigation.

Activation tracking: features/analytics/feature-adoption.ts provides trackFeatureAdoption(userId, tenantId, feature). It inserts into user_feature_adoptions with INSERT ... ON CONFLICT DO NOTHING, and only a first-ever insert emits feature.adopted. It is wired at 8 call sites: entity create, response submit, chat, tool run, task create, agent dispatch, view save, share create.
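
The "emit only on first insert" behavior can be illustrated with an in-memory stand-in for the table. This is a sketch, not the real module: the `Set` replaces user_feature_adoptions, the composite key of tenant, user, and feature is an assumption about the table's unique constraint, and the function names are hypothetical.

```typescript
// In-memory stand-in for the user_feature_adoptions table; the real code
// relies on INSERT ... ON CONFLICT DO NOTHING for the same effect.
const adoptionRows = new Set<string>();

// Returns true only when this call created the row, i.e. first-ever use.
function insertAdoptionIfAbsent(userId: string, tenantId: string, feature: string): boolean {
  const key = `${tenantId}:${userId}:${feature}`; // assumed unique key
  if (adoptionRows.has(key)) {
    return false; // conflict: the row already exists, so do nothing
  }
  adoptionRows.add(key);
  return true;
}

function trackFeatureAdoptionSketch(
  userId: string,
  tenantId: string,
  feature: string,
  emit: (event: string, payload: { feature: string }) => void,
): void {
  // Only a fresh insert emits the activation event.
  if (insertAdoptionIfAbsent(userId, tenantId, feature)) {
    emit("feature.adopted", { feature });
  }
}
```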

Server flush: lib/analytics/posthog-server.ts exports flushPostHog() and shutdownPostHog(). Key API routes wrap completion with after(() => flushPostHog()); Inngest steps await flushPostHog() before returning.

DNT honoring: lib/analytics/posthog-provider.tsx initializes PostHog with opt_out_capturing_by_default set from navigator.doNotTrack === "1".

Adding a new event:

Add a Zod schema to EVENT_SCHEMAS in features/analytics/events.ts.
Add a test case to features/analytics/events.test.ts.
Fire with void recordAnalyticsEvent("event.name", payload, tenantId, userId) at the source.
For E2E assertions, use attachPostHogCapture(page) from e2e/helpers/posthog.ts.

Cost Tracking

CostEvent -- A single AI usage record:

provider -- AI provider name (e.g., "anthropic", "openai")
model -- model identifier
input_tokens, output_tokens -- token counts from the API response
cost_cents -- computed cost in cents using model-specific pricing
agent_id -- which agent made the call (nullable)
chat_id -- which chat session (nullable)
source -- context (e.g., "chat", "extraction", "heartbeat")

CostSummary -- Aggregated metrics over a time window:

totalCostCents -- total spend
totalInputTokens, totalOutputTokens -- total tokens
dailyBurnRate -- average cost per day
eventCount -- number of API calls

CostTelemetrySummary -- Runtime telemetry aggregated from ai.runtime.completed analytics events:

totalCacheReadTokens, totalCacheWriteTokens, totalReasoningTokens
promptCacheHitCount, promptCacheWriteCount
promptCachingConfiguredCount, contextManagementConfiguredCount
appliedContextEditCount

CostByAgent -- Spend grouped by agent ID.

CostByModel -- Spend grouped by model name, enriched with cache-read, cache-write, and reasoning-token totals.

How It Works

Recording Costs

The recordCostEvent() function in features/cost/server/record-cost.ts:

Looks up the model in the ai_models registry via getModelEntry() to get per-million-token pricing
Calculates cost: (inputTokens / 1M * inputRate) + (outputTokens / 1M * outputRate), rounded to 2 decimal places in cents
Falls back to default rates ($3/1M input, $15/1M output) only when the model is not in the registry, for example on external-agent or DB-unavailable fallback paths
Inserts into cost_events table via admin client
Reports insert failures through the non-fatal error path without blocking the user-facing AI response

This function is called after every tracked streamText() / generateText() / generateObject() / embedding call. Image generation currently records visible zero-token rows until image pricing units are added.
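
As a worked example of the pricing formula, the sketch below computes cost in cents from token counts. It assumes rates are expressed in US dollars per million tokens and converted to cents before rounding; the registry contents, the `ModelPricing` shape, and the `computeCostCents` name are hypothetical and only the default rates come from the text above.

```typescript
interface ModelPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Fallback rates quoted above: $3 per 1M input tokens, $15 per 1M output tokens.
const DEFAULT_PRICING: ModelPricing = { inputPerMillionUsd: 3, outputPerMillionUsd: 15 };

// Hypothetical registry contents; the real rates come from the ai_models registry.
const PRICING_REGISTRY = new Map<string, ModelPricing>([
  ["example-small-model", { inputPerMillionUsd: 1, outputPerMillionUsd: 5 }],
]);

function computeCostCents(model: string, inputTokens: number, outputTokens: number): number {
  // Unknown models fall back to the default rates.
  const pricing = PRICING_REGISTRY.get(model) ?? DEFAULT_PRICING;
  const usd =
    (inputTokens / 1_000_000) * pricing.inputPerMillionUsd +
    (outputTokens / 1_000_000) * pricing.outputPerMillionUsd;
  // Convert dollars to cents, then round to 2 decimal places.
  return Math.round(usd * 100 * 100) / 100;
}
```

At the default rates, 1,000 input tokens and 500 output tokens cost $0.003 + $0.0075 = $0.0105, which this sketch reports as 1.05 cents.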

Recording Runtime Telemetry

The shared agent runtime also emits an analytics event after every completed run:

Event type: ai.runtime.completed
Metadata: model, source, agentId, chatId, cacheReadTokens, cacheWriteTokens, reasoningTokens, promptCachingConfigured, contextManagementConfigured, appliedContextEdits

This event is additive to cost_events: spend and raw token accounting remain in the cost table, while provider-specific runtime signals live in analytics.

Querying Cost Data

The getCostData(tenantId, days?, agentId?) function performs a single-pass aggregation:

Paginates through cost_events for spend and token totals.
Paginates through analytics_events for ai.runtime.completed telemetry.
Aggregates:
- Total cost, input tokens, output tokens
- Workspace or agent-scoped cache-read, cache-write, reasoning, and context-management metrics
- Per-agent spend
- Per-model spend plus cache/reasoning totals
Returns { summary, telemetry, byAgent, byModel }

A thin getCostSummary() wrapper delegates to getCostData() for callers that only need the top-line totals. The previous getCostByAgent() / getCostByModel() wrappers were deleted — they re-ran the full scan (cost events + runtime telemetry + agent-name join) just to return one property. Call getCostData() once and read the relevant breakdown instead.
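
The single-pass shape of that aggregation can be sketched over in-memory rows. This is a simplified illustration: pagination, the telemetry pass, and the agent-name join are omitted, the row and result types are assumptions, and `dailyBurnRate` is taken to be total cost divided by the window length in days, which the text does not spell out.

```typescript
interface CostRow {
  model: string;
  agent_id: string | null;
  input_tokens: number;
  output_tokens: number;
  cost_cents: number;
}

interface Breakdown {
  costCents: number;
  calls: number;
}

interface CostSummarySketch {
  totalCostCents: number;
  totalInputTokens: number;
  totalOutputTokens: number;
  dailyBurnRate: number;
  eventCount: number;
}

// Adds one call and its cost to the bucket for the given key.
function bump(map: Map<string, Breakdown>, key: string, costCents: number): void {
  const entry = map.get(key) ?? { costCents: 0, calls: 0 };
  entry.costCents += costCents;
  entry.calls += 1;
  map.set(key, entry);
}

// One loop over the rows fills the summary and both breakdowns.
function aggregateCostRows(rows: CostRow[], days: number) {
  const summary: CostSummarySketch = {
    totalCostCents: 0,
    totalInputTokens: 0,
    totalOutputTokens: 0,
    dailyBurnRate: 0,
    eventCount: 0,
  };
  const byAgent = new Map<string, Breakdown>();
  const byModel = new Map<string, Breakdown>();
  for (const row of rows) {
    summary.totalCostCents += row.cost_cents;
    summary.totalInputTokens += row.input_tokens;
    summary.totalOutputTokens += row.output_tokens;
    summary.eventCount += 1;
    if (row.agent_id !== null) {
      bump(byAgent, row.agent_id, row.cost_cents);
    }
    bump(byModel, row.model, row.cost_cents);
  }
  summary.dailyBurnRate = days > 0 ? summary.totalCostCents / days : 0; // assumed definition
  return { summary, byAgent, byModel };
}

// Thin wrapper in the spirit of getCostSummary(): one scan, one property read.
function summaryOnly(rows: CostRow[], days: number): CostSummarySketch {
  return aggregateCostRows(rows, days).summary;
}
```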

Admin Dashboard

The CostSummary component in features/cost/components/cost-summary.tsx renders:

Five KPI cards -- Total Spend (30d), Daily Burn Rate, Total Tokens, Cache Reuse, Reasoning Tokens
Runtime Telemetry -- cache writes, prompt-caching configuration count, context-management configuration count, applied context edits, cache-write tokens
Cost by Agent -- list of agents with call counts and spend
Cost by Model -- list of models with call counts, spend, cache-read, cache-write, and reasoning totals

The component fetches data from GET /api/costs/summary on mount. Empty state shows a message indicating costs are tracked automatically. Values are formatted with smart suffixes (cents for small amounts, dollars for larger; K/M for tokens).
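
The "smart suffix" formatting might look something like the sketch below. The thresholds (switching to dollars at 100 cents, to K at 1,000 tokens, and to M at 1,000,000 tokens) and the number of decimal places are assumptions for illustration; the component's actual cutoffs are not documented here.

```typescript
// Cents below one dollar stay in cents; larger amounts switch to dollars.
function formatCostCents(cents: number): string {
  if (cents < 100) {
    return `${cents.toFixed(2)}¢`;
  }
  return `$${(cents / 100).toFixed(2)}`;
}

// Token counts get a K or M suffix once they pass the assumed thresholds.
function formatTokens(count: number): string {
  if (count >= 1_000_000) {
    return `${(count / 1_000_000).toFixed(1)}M`;
  }
  if (count >= 1_000) {
    return `${(count / 1_000).toFixed(1)}K`;
  }
  return String(count);
}
```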

API Reference

Analytics

Function	Location	Purpose
`recordAnalyticsEvent(type, payload, tenantId, userId?)`	`features/analytics/record.ts`	Fire-and-forget event recording

Cost Recording

Function	Location	Purpose
`recordCostEvent(input)`	`features/cost/server/record-cost.ts`	Record AI usage with auto-pricing

Cost Queries

Function	Location	Purpose
`getCostData(tenantId, days?, agentId?)`	`features/cost/server/queries.ts`	Full cost data: summary + telemetry + by-agent + by-model
`getCostSummary(tenantId, days?)`	Same	Summary only (wrapper)

Types

Type	Location	Purpose
`CostEvent`	`features/cost/types.ts`	Single AI usage record
`CostSummary`	Same	Aggregated cost metrics
`CostTelemetrySummary`	Same	Aggregated prompt-caching and runtime telemetry metrics
`CostByAgent`	Same	Per-agent spend
`CostByModel`	Same	Per-model spend plus cache/reasoning totals

Components

Component	Location	Purpose
`CostSummary`	`features/cost/components/cost-summary.tsx`	Admin cost dashboard with KPIs and breakdowns

AI Limits and Cost Caps

In addition to tracking costs after the fact, the platform enforces configurable limits before AI calls are made. These limits are resolved per-tenant from tenant_settings (key "ai_limits") and fall back to environment variable defaults when no tenant setting exists.

AiLimits

interface AiLimits {
  /** Daily cost cap in cents. 0 = no cap. Default: 5000 ($50). */
  dailyCapCents: number;
  /** Chat messages per minute per user. 0 = no limit. Default: 30. */
  chatRateLimit: number;
  /** Max output tokens per model response. 0 = model default. Default: 0. */
  maxTokensPerResponse: number;
}

Resolving Limits

getAiLimits(tenantId) in features/cost/server/ai-limits.ts reads from the settings cascade:

Default tenant setting — a system-wide baseline in tenant_settings for the default tenant
Active tenant override — a per-tenant value in tenant_settings (higher priority, overwrites defaults field-by-field)
Environment variable fallback — AI_DAILY_CAP_CENTS, CHAT_RATE_LIMIT, AI_MAX_TOKENS when no setting row exists

The result is cached across requests via unstable_cache with tag-based invalidation, so a tenant setting change takes effect on the next request after saveTenantSetting() is called.
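
One way to picture the cascade is as a field-by-field merge. The sketch below is an interpretation, not the actual getAiLimits() code: it treats the environment values as the base layer (so they also fill any field a setting row leaves out), takes the environment as a plain object rather than reading it directly, and leaves caching out. The helper names are hypothetical.

```typescript
interface AiLimits {
  dailyCapCents: number;
  chatRateLimit: number;
  maxTokensPerResponse: number;
}

type Env = Record<string, string | undefined>;

// Parses an integer env value, using the fallback when it is missing or invalid.
function readInt(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

function limitsFromEnv(env: Env): AiLimits {
  return {
    dailyCapCents: readInt(env["AI_DAILY_CAP_CENTS"], 5000),
    chatRateLimit: readInt(env["CHAT_RATE_LIMIT"], 30),
    maxTokensPerResponse: readInt(env["AI_MAX_TOKENS"], 0),
  };
}

// Later layers win field by field: env, then default tenant, then active tenant.
function resolveAiLimits(
  env: Env,
  defaultTenantSetting?: Partial<AiLimits>,
  tenantOverride?: Partial<AiLimits>,
): AiLimits {
  return { ...limitsFromEnv(env), ...defaultTenantSetting, ...tenantOverride };
}
```

For example, with `AI_DAILY_CAP_CENTS=2000` in the environment, a default-tenant setting of `{ chatRateLimit: 10 }`, and a tenant override of `{ dailyCapCents: 10000 }`, this sketch resolves to a cap of 10000 cents, 10 messages per minute, and no token cap.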

Daily Cost Cap

checkTenantCostCap(tenantId, capCents) in features/cost/server/limits.ts enforces the daily spend limit:

If capCents <= 0, returns { allowed: true } immediately (no cap configured)
Queries cost_events for today's total spend (UTC midnight boundary)
Returns { allowed, currentCents, capCents }
Results are cached in-process for 5 minutes per tenant to avoid a DB query on every chat message

The check runs inside executeAgent() before any streamText() call. When the cap is exceeded, executeAgent() throws an error with a user-readable message including the current and cap amounts in dollars.

Fail-open policy: If the DB query errors (transient failure, connection issue), checkTenantCostCap returns { allowed: true } rather than blocking AI. This prevents a DB outage from silently disabling all AI features.
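
The cap check, its 5-minute cache, and the fail-open path can be sketched together. Several simplifications are assumed here: the spend query is injected and synchronous (the real query is asynchronous and scoped to the current UTC day), the clock is passed in as a parameter, a request is allowed while spend is strictly below the cap, and failed queries are not cached. The function and type names are hypothetical.

```typescript
interface CapResult {
  allowed: boolean;
  currentCents?: number;
  capCents?: number;
}

const CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes per tenant
const spendCache = new Map<string, { cents: number; fetchedAt: number }>();

function decide(currentCents: number, capCents: number): CapResult {
  // Assumption: allowed while today's spend is strictly below the cap.
  return { allowed: currentCents < capCents, currentCents, capCents };
}

function checkCostCapSketch(
  tenantId: string,
  capCents: number,
  queryTodaySpendCents: (tenantId: string) => number, // may throw on DB errors
  now: number,
): CapResult {
  if (capCents <= 0) {
    return { allowed: true }; // no cap configured
  }
  const cached = spendCache.get(tenantId);
  if (cached && now - cached.fetchedAt < CACHE_TTL_MS) {
    return decide(cached.cents, capCents); // served from cache, no query
  }
  try {
    const cents = queryTodaySpendCents(tenantId);
    spendCache.set(tenantId, { cents, fetchedAt: now });
    return decide(cents, capCents);
  } catch {
    // Fail open: a query failure must not block AI calls.
    return { allowed: true };
  }
}
```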

Chat Rate Limiting

The chat route (POST /api/chat) checks chatRateLimit before processing each message:

The limit is per user, per minute (sliding window keyed as chat:{userId})
When exceeded, the route returns HTTP 429 with a Retry-After-style message: "Rate limit exceeded. Try again in Xs."
A value of 0 disables rate limiting entirely

The rate limiter is in-process only — it resets on server restart and is not shared across replicas. For production deployments with multiple server instances, use a Redis-backed rate limiter.
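
A sliding-window limiter of this kind can be sketched with a per-key list of timestamps. This is an illustrative in-process version, not the route's actual code: the clock is passed in as a parameter, the retry delay is computed from the oldest request still inside the window, and the function names are hypothetical.

```typescript
const WINDOW_MS = 60_000; // one minute
const hits = new Map<string, number[]>();

interface RateDecision {
  allowed: boolean;
  retryAfterSeconds: number;
}

function checkChatRateLimit(userId: string, limitPerMinute: number, now: number): RateDecision {
  if (limitPerMinute <= 0) {
    return { allowed: true, retryAfterSeconds: 0 }; // 0 disables rate limiting
  }
  const key = `chat:${userId}`;
  // Keep only the requests that still fall inside the sliding window.
  const recent = (hits.get(key) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= limitPerMinute) {
    hits.set(key, recent);
    const oldest = recent[0] ?? now;
    // The next slot opens when the oldest request leaves the window.
    const retryAfterSeconds = Math.ceil((oldest + WINDOW_MS - now) / 1000);
    return { allowed: false, retryAfterSeconds };
  }
  recent.push(now);
  hits.set(key, recent);
  return { allowed: true, retryAfterSeconds: 0 };
}

// Message text matching the 429 response described above.
function rateLimitMessage(retryAfterSeconds: number): string {
  return `Rate limit exceeded. Try again in ${retryAfterSeconds}s.`;
}
```

Because the `hits` map lives in process memory, this sketch shares the limitation noted above: it resets on restart and is not shared across replicas.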

Token Cap

When maxTokensPerResponse > 0, executeAgent() passes maxTokens to streamText(). This limits output token spend per response and prevents runaway generation. A value of 0 defers to the model's own default.
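
A minimal sketch of that conditional, assuming the call options accept an optional maxTokens field; the `GenerationOptions` type and the `tokenCapOptions` helper are hypothetical and only illustrate omitting the field when the limit is 0.

```typescript
interface GenerationOptions {
  maxTokens?: number;
}

// Returns the option only when a positive cap is configured, so that a
// value of 0 leaves the model's own default in effect.
function tokenCapOptions(maxTokensPerResponse: number): GenerationOptions {
  return maxTokensPerResponse > 0 ? { maxTokens: maxTokensPerResponse } : {};
}
```

The result can be spread into the options passed to the generation call, so an unset cap adds nothing to the request.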

Configuration

Set limits via tenant_settings (key "ai_limits") or environment variables:

Env var	Default	Purpose
`AI_DAILY_CAP_CENTS`	`5000` ($50)	Daily total spend cap
`CHAT_RATE_LIMIT`	`30`	Chat messages per user per minute
`AI_MAX_TOKENS`	`0` (no limit)	Max output tokens per response

To override for a specific tenant, insert a row into tenant_settings:

INSERT INTO tenant_settings (tenant_id, key, value)
VALUES (
  'your-tenant-id',
  'ai_limits',
  '{"dailyCapCents": 10000, "chatRateLimit": 60, "maxTokensPerResponse": 4096}'
);

API Reference — Limits

Function	Location	Purpose
`getAiLimits(tenantId)`	`features/cost/server/ai-limits.ts`	Resolve per-tenant AI limits with cascade + env fallback
`checkTenantCostCap(tenantId, capCents)`	`features/cost/server/limits.ts`	Check today's spend against cap; cached 5 min
`__clearCostCapCache()`	Same	Clear in-process cache (test use only)

For Agents

Agents generate cost events automatically through their AI API calls. Every streamText() call in chat, extraction, and heartbeat records a cost event attributed to the agent's ID.

Agents can query their own cost data through the getUsageStats context tool, which returns token totals, cost totals, prompt-caching signals, reasoning usage, and top-model breakdowns. This lets an agent reason about both its spend and its context efficiency.

There are no agent tools for analytics event recording -- that happens automatically at the platform level.

Related Features

Agent System (features/agents/) -- agent IDs are recorded on cost events; executeAgent() enforces cost cap and token limits
Chat (features/chat/) -- records cost events after each AI response; POST /api/chat enforces chat rate limit
Extraction (features/entities/extraction/) -- records cost events per field extraction
Inngest (features/inngest/) -- heartbeat and background jobs record costs and respect daily cap
Settings (features/settings/) -- stores "ai_limits" key in tenant_settings; provides the settings cascade consumed by getAiLimits()
Admin -- cost dashboard is rendered in the Admin > Costs tab
