# Collab Sessions
Live multi-actor state patches on top of the unified sessions primitive — no CRDT transport, no new tables.
## Overview
Collab Sessions is the Sprinter Platform's live-collaboration runtime. A workshop — two or more participants co-editing an SOP board, a task DAG, a form, or any future canvas-backed surface — runs inside a single `sessions` row with `session_type = 'mixed'`. Every structural edit is a `state.patch` event appended to `session_events` and broadcast to every connected participant. No new table, no Yjs transport, no separate "Shared Artifact" primitive.
See [ADR-0005](/adr/0005-collab-sessions-and-state-patches) for the full rationale and rejected alternatives.
## Key Concepts
### Mixed workshop session
A parent `mixed` session is the workshop container. Its `target` (entity or view) identifies what's being edited. Per-user drafts and per-interaction tool runs spawn child sessions via `parent_session_id`, exactly like the chat message pattern.
### `state.patch` event
A session event type carrying a batch of [RFC 6902 JSON Patch](https://datatracker.ietf.org/doc/html/rfc6902) ops applied to a specific column subtree of a view or entity row. Every patch is validated, applied authoritatively on the server, logged, and broadcast. The `session_events` table is the authoritative audit log.
```ts
// features/collab/types.ts
type StatePatchPayload = {
  event_type: "state.patch"
  target: { kind: "view" | "entity"; id: string; path: string }
  ops: PatchOp[] // RFC 6902, at most MAX_PATCH_OPS_PER_EVENT
  origin: "user" | "agent"
  client_id?: string // used for own-echo suppression on broadcast
}
```
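To make the payload shape concrete, here is a minimal hand-rolled guard. It is only a sketch: the actual route validates the body with Zod, and the `isStatePatchPayload` helper, the local `PatchOp` stand-in, and the `MAX_PATCH_OPS_PER_EVENT` value of 100 are assumptions made for this example.

```typescript
// Local stand-ins: the real PatchOp type and limit live in features/collab.
type PatchOp = { op: string; path: string; value?: unknown; from?: string }
const MAX_PATCH_OPS_PER_EVENT = 100 // assumed value, for illustration only

type StatePatchPayload = {
  event_type: "state.patch"
  target: { kind: "view" | "entity"; id: string; path: string }
  ops: PatchOp[]
  origin: "user" | "agent"
  client_id?: string
}

const RFC6902_OPS = new Set(["add", "remove", "replace", "move", "copy", "test"])

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v)
}

function isPatchOp(v: unknown): v is PatchOp {
  if (!isRecord(v)) return false
  if (typeof v.op !== "string" || !RFC6902_OPS.has(v.op)) return false
  // A JSON Pointer is either empty (whole document) or starts with "/".
  return typeof v.path === "string" && (v.path === "" || v.path.startsWith("/"))
}

function isStatePatchPayload(input: unknown): input is StatePatchPayload {
  if (!isRecord(input)) return false
  if (input.event_type !== "state.patch") return false
  if (input.origin !== "user" && input.origin !== "agent") return false
  if (input.client_id !== undefined && typeof input.client_id !== "string") return false

  const target = input.target
  if (!isRecord(target)) return false
  if (target.kind !== "view" && target.kind !== "entity") return false
  if (typeof target.id !== "string" || typeof target.path !== "string") return false

  const ops = input.ops
  if (!Array.isArray(ops) || ops.length > MAX_PATCH_OPS_PER_EVENT) return false
  return ops.every(isPatchOp)
}
```

The guard only checks shape and batch size. Whether the ops actually apply to the target subtree is decided later, when the server applies the patch.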
### Broadcast channel
Patches are broadcast on the Supabase Realtime channel `tenant:<tenantId>:session:<sessionId>`. Each broadcast carries a `StatePatchBroadcast` envelope: the validated payload plus server-stamped `event_id`, `sequence`, `applied_at`, and `actor_id`. Clients filter their own echoes by `client_id`.
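The channel naming and echo filter can be sketched as two small pure functions. `sessionChannelName` and `isOwnEcho` are hypothetical names for this example (the platform has its own channel-name builder), and the envelope type is trimmed to the fields named above.

```typescript
// Trimmed, illustrative envelope; the real StatePatchBroadcast also carries
// the validated payload fields (target, origin, ...).
type StatePatchBroadcast = {
  event_id: string
  sequence: number
  applied_at: string
  actor_id: string
  client_id?: string
  ops: unknown[]
}

// Hypothetical helper that mirrors the documented channel naming scheme.
function sessionChannelName(tenantId: string, sessionId: string): string {
  return `tenant:${tenantId}:session:${sessionId}`
}

// A broadcast is an own echo only when it carries this client's id.
// Envelopes without a client_id (for example agent writes) are never filtered.
function isOwnEcho(envelope: StatePatchBroadcast, localClientId: string): boolean {
  return envelope.client_id !== undefined && envelope.client_id === localClientId
}
```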
### Canvas node bindings
A `CanvasNodeBinding` declares how gestures on a node map to entity field writes. The canvas engine routes gestures through `gestureToFieldWrite(binding, gesture)`, which returns either a `FieldWrite` (the existing entity update path handles it) or `null` (layout-only, goes through `state.patch` on the parent session). Bindings live on node templates and are framework-agnostic.
```ts
// features/collab/node-binding.ts
type CanvasNodeBinding = {
  entityId?: string
  entityTypeSlug?: string
  edgesFrom?: { fieldName: string; cardinality?: "one" | "many" }
  containerParent?: { fieldName: string }
  rankWithinContainer?: { fieldName: string; criteriaSetId?: string }
}
```
Dragging an edge `A → B` on a node with `edgesFrom: { fieldName: 'depends_on' }` produces a `FieldWrite` that appends `A` to `B.depends_on`. No new integration glue per surface.
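The edge example above can be sketched as a pure resolver. This is not the real `gestureToFieldWrite`: the `Gesture` and `FieldWrite` shapes, the `resolveGesture` name, and the default cardinality of `"many"` are all assumptions made for illustration.

```typescript
type CanvasNodeBinding = {
  entityId?: string
  entityTypeSlug?: string
  edgesFrom?: { fieldName: string; cardinality?: "one" | "many" }
  containerParent?: { fieldName: string }
  rankWithinContainer?: { fieldName: string; criteriaSetId?: string }
}

// Hypothetical gesture and write shapes, not the platform's actual types.
type Gesture =
  | { kind: "connect"; sourceEntityId: string; targetEntityId: string }
  | { kind: "move"; nodeId: string; x: number; y: number }

type FieldWrite = {
  entityId: string
  fieldName: string
  op: "append" | "set"
  value: string
}

function resolveGesture(binding: CanvasNodeBinding, gesture: Gesture): FieldWrite | null {
  if (gesture.kind === "connect" && binding.edgesFrom) {
    // Assumed default: a missing cardinality is treated as "many".
    const many = (binding.edgesFrom.cardinality ?? "many") === "many"
    return {
      entityId: gesture.targetEntityId, // edge A -> B writes onto B
      fieldName: binding.edgesFrom.fieldName,
      op: many ? "append" : "set",
      value: gesture.sourceEntityId,
    }
  }
  // Layout-only gestures have no field meaning; they travel as state.patch.
  return null
}
```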
## How It Works
```
client gesture
      ↓
generate PatchOp[] ──────────────► optimistic local apply
      │
      ▼
POST /api/sessions/[id]/patch
      │   (Zod-validated body)
      ▼
applyCollabPatch(server helper):
  1. validate session tenant
  2. load target row (views | entities, tenant-scoped)
  3. applyPatchToArtifact (pure, RFC 6902)
  4. UPDATE row (only the root column of target.path)
  5. appendSessionEvent(state.patch)
  6. broadcastOnSessionChannel(tenant:t:session:id, state.patch, envelope)
      ↓
other clients' useCollabPatch receives broadcast, applies ops,
filters own echoes via client_id
```
Conflicts resolve last-write-wins per JSON path. The row in the DB is the authoritative state, the event log is the authoritative history, and the broadcast is fire-and-forget to keep latency low.
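To illustrate what last-write-wins per JSON path means, here is a deliberately small patch applier. It is a sketch, not the platform's `applyPatchToArtifact`: it supports only `add`, `replace`, and `remove`, and the `applyOps` and `parsePointer` names are invented for this example.

```typescript
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string }

// JSON round-trip copy; adequate for plain JSON state.
function clone<V>(v: V): V {
  return JSON.parse(JSON.stringify(v)) as V
}

// Decode an RFC 6901 JSON Pointer into unescaped reference tokens.
function parsePointer(path: string): string[] {
  if (path === "") return []
  if (!path.startsWith("/")) throw new Error(`invalid JSON Pointer: ${path}`)
  return path
    .slice(1)
    .split("/")
    .map((t) => t.replace(/~1/g, "/").replace(/~0/g, "~"))
}

// Apply ops in order to a copy of `doc`; the input is never mutated.
function applyOps<T>(doc: T, ops: PatchOp[]): T {
  let root: unknown = clone(doc)
  for (const op of ops) {
    const tokens = parsePointer(op.path)
    if (tokens.length === 0) {
      if (op.op === "remove") throw new Error("cannot remove the document root")
      root = clone(op.value)
      continue
    }
    // Walk to the parent container of the final token.
    let parent: any = root
    for (const token of tokens.slice(0, -1)) {
      parent = parent === null || typeof parent !== "object" ? undefined : parent[token]
    }
    if (parent === null || typeof parent !== "object") {
      throw new Error(`path not found: ${op.path}`)
    }
    const key = tokens[tokens.length - 1] as string
    if (Array.isArray(parent)) {
      // "-" addresses the slot after the last element (append).
      const index = key === "-" ? parent.length : /^(0|[1-9]\d*)$/.test(key) ? Number(key) : NaN
      const limit = op.op === "add" ? parent.length : parent.length - 1
      if (!(index >= 0 && index <= limit)) throw new Error(`bad array index: ${op.path}`)
      if (op.op === "add") parent.splice(index, 0, clone(op.value))
      else if (op.op === "replace") parent[index] = clone(op.value)
      else parent.splice(index, 1)
    } else {
      if (op.op !== "add" && !(key in parent)) throw new Error(`path not found: ${op.path}`)
      if (op.op === "remove") delete parent[key]
      else parent[key] = clone(op.value)
    }
  }
  return root as T
}

// Two writers touch the same path: whichever patch is applied last wins.
const base = { nodes: [{ id: "a", position: { x: 0, y: 0 } }] }
const afterAlice = applyOps(base, [{ op: "replace", path: "/nodes/0/position/x", value: 120 }])
const afterBob = applyOps(afterAlice, [{ op: "replace", path: "/nodes/0/position/x", value: 300 }])
```

Writes to different paths do not conflict: a concurrent `replace` on `/nodes/0/position/y` would survive alongside the `x` update.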
## API Reference
### Client
#### `useCollabPatch<T>(args)`
```ts
const { state, apply, status } = useCollabPatch<CanvasState>({
  sessionId,
  tenantId,
  target: { kind: "view", id: viewId, path: "surface_config.canvas_state" },
  initial: initialCanvasState,
})

// Optimistically apply an RFC 6902 patch locally and POST to the server.
await apply([{ op: "replace", path: "/nodes/0/position/x", value: 120 }])
```
Reverts local state on server rejection. Subscribes to the session broadcast channel for the lifetime of the component. `status` reports `connecting | live | offline | error`.
**Reconnect rehydration.** When the Realtime channel drops and reconnects (`error`/`offline` → `live`), the hook fetches `GET /api/sessions/[id]/events?eventTypes=state.patch&afterSequence=<lastSeen>&limit=500` and applies missed patches in sequence order. Own-client echoes are filtered so non-idempotent ops (array appends, `copy`, etc.) do not double-apply. This closes the silent-divergence window a naive subscriber would have across a network blip.
Pass `initialSequence` when the `initial` snapshot was read at a known sequence — rehydration uses that cursor to avoid refetching events already baked into the snapshot. Omitting it means "replay from 0 on the first reconnect," which is idempotent under LWW but wastes a page fetch.
The hook also returns `remoteVersion: number` — a monotonic counter that bumps only on remote-origin state changes (channel broadcasts + rehydrate fills). Consumers whose underlying renderer is uncontrolled (React Flow's `useNodesState`, CodeMirror, Monaco) can use this as a React `key` to force a remount when canonical state moves out from under them.
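The rehydration rules described above (apply in sequence order, skip own echoes, bump `remoteVersion` only on remote changes) can be sketched as a pure reducer. The `rehydrate` function and `Cursor` type are hypothetical; in particular, bumping `remoteVersion` once per fill rather than once per patch is an assumption.

```typescript
type PatchEvent<O> = { sequence: number; client_id?: string; ops: O[] }

type Cursor<S> = {
  state: S
  lastSeenSequence: number
  remoteVersion: number
}

function rehydrate<S, O>(
  current: Cursor<S>,
  fetched: PatchEvent<O>[],
  localClientId: string,
  applyOps: (state: S, ops: O[]) => S,
): Cursor<S> {
  // Drop events already reflected locally, then replay in sequence order.
  // filter() returns a new array, so sorting does not mutate `fetched`.
  const missed = fetched
    .filter((e) => e.sequence > current.lastSeenSequence)
    .sort((a, b) => a.sequence - b.sequence)

  let state = current.state
  let lastSeenSequence = current.lastSeenSequence
  let remoteApplied = 0

  for (const event of missed) {
    // The cursor advances past own echoes too, so they are not refetched.
    lastSeenSequence = event.sequence
    // Own patches were already applied optimistically; replaying a
    // non-idempotent op such as an array append would double-apply it.
    if (event.client_id === localClientId) continue
    state = applyOps(state, event.ops)
    remoteApplied += 1
  }

  return {
    state,
    lastSeenSequence,
    // Assumption: one bump per rehydrate fill that changed state.
    remoteVersion: current.remoteVersion + (remoteApplied > 0 ? 1 : 0),
  }
}
```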
### Canvas surface adoption (v1.5)
`CanvasSurface` opts into collab automatically when the caller supplies a `collab` prop on `SurfaceProps`:
```tsx
<CanvasSurface
  blocks={blocks}
  surfaceConfig={view.surface_config}
  collab={{
    sessionId: workshopSessionId, // a mixed session
    tenantId: activeTenantId,
    viewId: view.id,
    initialSequence: view.last_event_sequence, // optional
  }}
/>
```
When `collab` is absent the surface renders its original debounced-save path with zero behavior change — `/whiteboard`, entity-detail canvases, and the workspace editor keep working unchanged. When present, whole-canvas saves become `replace /` JSON patches and the engine remounts on `remoteVersion` so remote edits appear immediately.
**v1.5 tradeoff: engine reseed on remote patches.** The surface keys `CanvasEngine` on `remoteVersion`, so every incoming remote patch unmounts and remounts the underlying React Flow instance, reseeding from the new canonical state. This is the minimal change that keeps the current uncontrolled `useNodesState`/`useEdgesState` engine viable under multi-actor writes. The cost is that transient UI state (current selection, pan, zoom) resets on each remote patch. That is acceptable for SOP-scale workshops, where remote patches land a few times per minute; a controlled-engine follow-up is captured for high-frequency surfaces.
#### `subscribeToPatchChannel(supabase, ctx, onPatch)`
Lower-level channel subscription. Filters own-echoes by `ctx.clientId`. Returns `{ close }`.
#### `gestureToFieldWrite(binding, gesture)`
Pure resolver. Returns a `FieldWrite` or `null`.
### Server
#### `applyCollabPatch(args)` — `features/collab/server/apply-patch.ts`
Server-authoritative orchestrator. Call this from internal code (the agent `applyStatePatch` tool, Inngest jobs). The HTTP route `POST /api/sessions/[id]/patch` delegates here.
```ts
const { event_id, sequence, applied_at } = await applyCollabPatch({
  sessionId,
  tenantId,
  actorId: userId,
  target,
  ops,
  origin: "user", // or "agent"
  clientId, // optional; omit for agent origin
})
```
Throws on session tenant mismatch, missing row, empty-row update, terminal session status, or a non-`mixed` session type. Exhausting the compare-and-swap (CAS) retry loop on `updated_at` raises one of two errors:
- `CollabRlsDeniedError` (`status: 403`) — every retry saw the same `updated_at` but the UPDATE still missed. Row is visible under SELECT-RLS but the UPDATE policy rejects this user (e.g. a member hitting an editor+ write policy).
- `CollabContentionError` (`status: 409`) — `updated_at` advanced on every retry. Real concurrent-writer storm; caller should back off and retry the user gesture.
`apiErrorResponse` maps both to their declared HTTP status so clients see 403/409, not a misleading 500.
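The 403/409 split can be sketched as a retry loop that records the `updated_at` value seen on each attempt. This is a synchronous simplification under stated assumptions: the `RowStore` interface, the `casUpdate` name, and the limit of three attempts are invented here, the error classes are local stand-ins for the real ones, and classifying a mixed history as contention is a guess.

```typescript
// Local stand-ins for the real error classes.
class CollabRlsDeniedError extends Error {
  readonly status = 403
}
class CollabContentionError extends Error {
  readonly status = 409
}

// Hypothetical storage seam; the real helper talks to Supabase asynchronously.
interface RowStore {
  readUpdatedAt(): string
  // True when the UPDATE matched a row whose updated_at equals `expected`.
  updateWhereUnchanged(expected: string): boolean
}

// Returns the number of attempts used, or throws once retries are exhausted.
function casUpdate(store: RowStore, maxAttempts = 3): number {
  const seen: string[] = []
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const stamp = store.readUpdatedAt()
    seen.push(stamp)
    if (store.updateWhereUnchanged(stamp)) return attempt
  }
  // Nobody else wrote, yet every UPDATE missed: the write policy rejected us.
  const unchanged = seen.every((s) => s === seen[0])
  if (unchanged) {
    throw new CollabRlsDeniedError("UPDATE rejected although the row never changed")
  }
  // Assumption: any observed advance of updated_at counts as contention.
  throw new CollabContentionError("row kept changing between read and write")
}
```

Distinguishing the two cases matters to callers: retrying a denied write can never succeed, while a contended write usually succeeds after a short back-off.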
#### `broadcastOnSessionChannel(ctx, event, payload)` — `features/realtime/server.ts`
Admin-client Supabase Realtime Broadcast wrapper. Fire-and-forget.
## For Agents
Agent writes go through the same primitive with `origin: "agent"`. Supervised writes (from chat) inherit the user's permissions. Autonomous writes (heartbeat, Inngest) use the agent's role permissions.
### `applyStatePatch` tool (v3)
Registered by `createCollabToolDefinitions()` in `features/tools/collab-tools.ts`. Thin delegate to the server `applyCollabPatch` helper — same tenant scoping, same CAS, same error-code split (403/409/5xx). Appended session event is stamped `origin: 'agent'`, `role: 'agent'` so the transcript UI can badge agent writes distinctly from human writes.
```ts
applyStatePatch({
  sessionId: workshopSessionId,
  target: {
    kind: "view",
    id: viewId,
    path: "surface_config.canvas_state",
  },
  ops: [
    { op: "add", path: "/nodes/-", value: { id: "n-new", position: { x: 200, y: 100 } } },
  ],
})
```
**Scope guard.** When `options.allowedSessionId` is set at tool-construction time, patches to other sessions are rejected **before hitting the server**. Heartbeat / task executors pin this to the running workshop session so an agent's plan can't spray writes across workshops mid-run.
**Groups & admin-client posture.** The tool is registered to a single `collab` group — NOT to the default `entity` bundle. Callers that want agent live-edits must deliberately include `"collab"` in the tool-group list once workshop context is loaded. Rationale: `applyCollabPatch` runs its row UPDATE via the admin Supabase client, which bypasses RLS; the dispatch-time permission check + `allowedSessionId` are the effective gate today. Both tighten once workshop-membership RBAC (MEDIUM #10) lands and the tool retargets to `workshop.team.edit`.
**Error envelope.** Failures return `{ error, code }` where `code` is one of `"denied"` (403 — RLS rejected the UPDATE, stop retrying), `"contention"` (409 — concurrent-writer storm, back off and retry), `"out_of_scope"` (sessionId is outside `allowedSessionId`), or `"unknown"` (5xx or un-tagged error, use generic retry policy). Propagates the structured `status` from `CollabRlsDeniedError` / `CollabContentionError` so agent retry logic can branch on intent, not regex.
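The scope guard and error codes above suggest a small decision function for agent retry logic. The following is a sketch only: `checkSessionScope`, `decideRetry`, and the back-off numbers are assumptions, not the tool's actual behavior.

```typescript
type CollabToolErrorCode = "denied" | "contention" | "out_of_scope" | "unknown"
type ToolFailure = { error: string; code: CollabToolErrorCode }

// Local scope guard: reject before any server call when the tool is pinned
// to one workshop session and the patch targets a different one.
function checkSessionScope(sessionId: string, allowedSessionId?: string): ToolFailure | null {
  if (allowedSessionId !== undefined && sessionId !== allowedSessionId) {
    return { error: `session ${sessionId} is outside the allowed scope`, code: "out_of_scope" }
  }
  return null
}

type RetryDecision = { retry: false } | { retry: true; delayMs: number }

// Branch on the structured code rather than parsing the error message.
function decideRetry(failure: ToolFailure, attempt: number): RetryDecision {
  switch (failure.code) {
    case "denied":
    case "out_of_scope":
      // Retrying cannot change a permission or scoping outcome.
      return { retry: false }
    case "contention":
      // Assumed policy: exponential back-off capped at two seconds.
      return { retry: true, delayMs: Math.min(2000, 100 * 2 ** attempt) }
    case "unknown":
      // Assumed generic policy: two more tries at a fixed delay.
      return attempt < 2 ? { retry: true, delayMs: 500 } : { retry: false }
  }
}
```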
**Permission (interim).** Declares `entities.team.update` as its coarse gate until the workshop-membership + `workshop.team.edit` permission lands in PR-v4. Tracked in `documents/work/2026-04-23-collab-sessions/followups.md` as MEDIUM #10.
**When to use vs. existing tools.** Use `applyStatePatch` for *structural* edits (move a node, add an edge, rearrange a container). Keep `updateEntity` / `submitResponse` for content edits on entity fields. The two paths coexist inside one workshop session: the state-patch stream carries layout history; response sessions carry field draft-and-promote flows.
## Response-mode views (v4 + v4.1)
Every `views` row now carries `edit_mode text NOT NULL DEFAULT 'direct'`, with a CHECK constraint restricting it to `('direct', 'response')`. The `'direct'` default preserves today's single-user behaviour for every existing view.
When a view has `edit_mode = 'response'`, collab gestures against that view accumulate as drafts on a per-user child response session (`session_type='response'`, `parent_id=<workshop>`, `status='draft'`) instead of mutating the canonical view row. Submission + promotion flow through the existing `entity_responses` pipeline — workshop drafts become proposals that a tenant admin promotes explicitly. The mode is switched at view creation / edit time by the caller that owns that view's lifecycle.
`app_permission::workshop.team.edit` is reserved for members participating in a workshop session's live-edit stream. Until the workshop-membership table + role grants land, `applyStatePatch` (v3) uses `entities.team.update` as a coarse interim gate.
### v4.1 routing (shipped)
- **Binding field**: `CollabBinding.editMode: 'direct' | 'response'` (default `'direct'`); response mode additionally requires `responseSessionId` on the binding.
- **Hook**: `useCollabPatch` branches on `editMode`. Direct mode POSTs to `/api/sessions/[id]/patch` (unchanged). Response mode POSTs to `/api/sessions/[id]/patch-draft` with `response_session_id` in the body.
- **Server helper**: `features/collab/server/apply-patch-draft.ts#applyCollabPatchDraft` validates the workshop session (`mixed`, active, same tenant), validates the child response session (`session_type='response'`, `parent_id=<workshop>`, `status='draft'`, same tenant), and applies the RFC 6902 ops against `draft_values` with the same CAS-on-`updated_at` loop as direct mode. RLS-denied vs. real contention is surfaced as `CollabRlsDeniedError` (403) vs. `CollabContentionError` (409), matching the direct path.
- **Event type**: `state.patch.draft` — appended on the WORKSHOP session's event log so participants watching the transcript see presence. The event metadata carries the full ops for audit + rehydrate.
- **Broadcast payload is REDACTED**: the presence broadcast on `tenant:<t>:session:<id>` carries `{event_type, event_id, sequence, applied_at, actor_id, response_session_id, target, client_id}` and intentionally **strips `ops` and values**. Drafts stay private to the authoring user until the response is submitted.
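The redaction step can be sketched as an explicit field whitelist, so that a field added to the stored event later stays off the wire unless someone opts it in. The `toPresenceBroadcast` name and both type names are assumptions; the field list follows the broadcast shape given above.

```typescript
type DraftTarget = { kind: "view" | "entity"; id: string; path: string }

// Illustrative shape of the stored event: full ops are kept for audit.
type DraftPatchEvent = {
  event_type: "state.patch.draft"
  event_id: string
  sequence: number
  applied_at: string
  actor_id: string
  response_session_id: string
  target: DraftTarget
  client_id?: string
  ops: { op: string; path: string; value?: unknown }[]
}

type DraftPresenceBroadcast = Omit<DraftPatchEvent, "ops">

// Copy only the presence fields; `ops` (and the values inside) never leave.
function toPresenceBroadcast(event: DraftPatchEvent): DraftPresenceBroadcast {
  return {
    event_type: event.event_type,
    event_id: event.event_id,
    sequence: event.sequence,
    applied_at: event.applied_at,
    actor_id: event.actor_id,
    response_session_id: event.response_session_id,
    target: { ...event.target },
    ...(event.client_id !== undefined ? { client_id: event.client_id } : {}),
  }
}
```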
### Phase 7 adoption chain (shipped)
The three pieces of plumbing that turn `edit_mode='response'` from an inert column into a live write path:
- **`UnifiedViewRenderer.collabSession`** — optional prop on `features/views/components/unified-view-renderer.tsx`. When supplied, the renderer calls `buildCollabBinding(view, collabSession)` and threads the result to `SurfaceRenderer` as `collab`. Absent → single-user rendering. Type: `CollabSessionInput = { sessionId, tenantId, responseSessionId?, initialSequence? }`. The call is intentionally bare (no `useMemo`) — downstream consumers destructure the binding to primitives (`sessionId`, `tenantId`, `responseSessionId`) before using them as `useEffect` deps, so reference instability of the returned `collab` object does not churn Realtime channel subscriptions. Surfaces that need to hold the binding directly in a hook dep should memoize inside the surface, not at this seam.
- **`buildCollabBinding(view, collabSession)`** — pure helper in `features/views/lib/build-collab-binding.ts`. Returns a `CollabBindingResolution` discriminated union: `{ kind: 'none' }` when no workshop session is supplied (single-user render), `{ kind: 'binding', binding }` when the binding is usable, or `{ kind: 'misconfigured', reason: 'missing-response-session-id' }` when `edit_mode='response'` was requested without a `responseSessionId`. On misconfig it also emits `console.error` naming the fix and captures a Sentry warning (`tags.module='collab', phase='build-binding'`) so prod incidents are visible. **The renderer fails closed on misconfig** — forces `editable=false` and strips every write callback before forwarding to `SurfaceRenderer` so canvas writes cannot leak in-progress draft work onto the canonical view state. Unknown `edit_mode` values fall back to `'direct'`; a `responseSessionId` supplied alongside `edit_mode='direct'` is stripped before returning so stale draft ids don't leak into realtime metadata.
- **`ensureResponseSession({ workshopSessionId, actorId, tenantId, viewId?, entityId? })`** — server-only admin-client helper in `features/responses/server/ensure-response-session.ts`. Looks up or lazily creates the `session_type='response'` child session for this participant. Pre-checks the workshop row (tenant equality, `session_type='mixed'`, status ∈ `WORKSHOP_ACCEPTS_DRAFT_STATUSES` — `running`, `pending`, `waiting_human`, `awaiting_tool`, `idle`; rejects `draft` / terminal). A partial unique index `sessions_response_draft_per_parent_user_uniq` on `(parent_id, user_id, tenant_id) WHERE session_type='response' AND status='draft' AND parent_id IS NOT NULL` makes the get-or-create atomic: a concurrent duplicate INSERT raises `23505`; the race loser re-selects the winner's id and converges on the same draft. Migration includes `SET LOCAL lock_timeout = '5s'` to avoid blocking hot-path `sessions` writes during deploy.
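The three-way result of `buildCollabBinding` can be sketched as a pure function. This is an approximation built from the description above, not the real helper: `resolveBinding` is a hypothetical name, the `CollabBinding` fields are inferred from the props shown elsewhere on this page, and the `console.error` and Sentry reporting are left out.

```typescript
type EditMode = "direct" | "response"

type CollabSessionInput = {
  sessionId: string
  tenantId: string
  responseSessionId?: string
  initialSequence?: number
}

// Inferred binding shape; the real type may carry more fields.
type CollabBinding = {
  sessionId: string
  tenantId: string
  viewId: string
  editMode: EditMode
  responseSessionId?: string
  initialSequence?: number
}

type CollabBindingResolution =
  | { kind: "none" }
  | { kind: "binding"; binding: CollabBinding }
  | { kind: "misconfigured"; reason: "missing-response-session-id" }

function resolveBinding(
  view: { id: string; edit_mode?: string },
  session?: CollabSessionInput,
): CollabBindingResolution {
  // No workshop session supplied: plain single-user rendering.
  if (!session) return { kind: "none" }

  // Unknown edit_mode values fall back to "direct".
  const editMode: EditMode = view.edit_mode === "response" ? "response" : "direct"

  const shared = {
    sessionId: session.sessionId,
    tenantId: session.tenantId,
    viewId: view.id,
    initialSequence: session.initialSequence,
  }

  if (editMode === "response") {
    // Callers must fail closed on this result and render read-only.
    if (!session.responseSessionId) {
      return { kind: "misconfigured", reason: "missing-response-session-id" }
    }
    return {
      kind: "binding",
      binding: { ...shared, editMode, responseSessionId: session.responseSessionId },
    }
  }

  // Direct mode: a stale responseSessionId is deliberately not copied over.
  return { kind: "binding", binding: { ...shared, editMode } }
}
```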
**SECURITY — caller contract.** `ensureResponseSession` uses `createAdminClient()` (RLS-bypassing) and the helper trusts both `actorId` and `tenantId`. Callers MUST resolve them via `requireAuth()` / `getUserId()` and `getTenantContext()` / `getActiveTenantId()` respectively. Workshop-membership is NOT enforced here — it's a separate follow-up. See the file-level JSDoc for the full contract + `@example`.
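The atomic get-or-create that `ensureResponseSession` relies on can be sketched against an in-memory store that mimics the partial unique index. Everything here is illustrative: `DraftStore`, `InMemoryDrafts`, and `getOrCreateDraft` are invented names, the code is synchronous for brevity, and the workshop pre-checks are omitted. The only real-world detail is `23505`, PostgreSQL's `unique_violation` error code.

```typescript
type DraftKey = { parentId: string; userId: string; tenantId: string }

class UniqueViolation extends Error {
  readonly code = "23505" // PostgreSQL unique_violation
}

interface DraftStore {
  select(key: DraftKey): string | undefined
  insert(key: DraftKey): string
}

// Mimics a unique index on (parent_id, user_id, tenant_id) for draft rows.
class InMemoryDrafts implements DraftStore {
  private rows = new Map<string, string>()
  private nextId = 1

  private static keyOf(k: DraftKey): string {
    return `${k.parentId}|${k.userId}|${k.tenantId}`
  }

  select(k: DraftKey): string | undefined {
    return this.rows.get(InMemoryDrafts.keyOf(k))
  }

  insert(k: DraftKey): string {
    const key = InMemoryDrafts.keyOf(k)
    if (this.rows.has(key)) throw new UniqueViolation("duplicate draft session")
    const id = `resp-${this.nextId++}`
    this.rows.set(key, id)
    return id
  }
}

function getOrCreateDraft(store: DraftStore, key: DraftKey): string {
  const existing = store.select(key)
  if (existing !== undefined) return existing
  try {
    return store.insert(key)
  } catch (err) {
    // Any failure other than the unique violation is a real error.
    if ((err as { code?: string }).code !== "23505") throw err
    // We lost the race: converge on the row the winner inserted.
    const winner = store.select(key)
    if (winner === undefined) throw err
    return winner
  }
}
```

Because the index does the arbitration, no explicit lock is needed: both racers end up holding the same draft session id.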
**Page-level usage:**
```tsx
// Server component (page / layout)
const { userId } = await requireAuth(); // from features/tenant/auth
const { tenantId } = await getTenantContext(); // from features/tenant/context
const { responseSessionId } = await ensureResponseSession({
  workshopSessionId: workshop.id,
  actorId: userId,
  tenantId,
  viewId: view.id,
});

// Pass to UnifiedViewRenderer
<UnifiedViewRenderer
  view={view} // carries edit_mode
  resolvedBlocks={blocks}
  collabSession={{
    sessionId: workshop.id,
    tenantId,
    responseSessionId, // omit in direct-mode workshops
    initialSequence: lastSeenSequence,
  }}
/>
```
The first real consumer is the v5 SOP program-canvas adoption; that work wires `getOrCreateProgramCanvas` to return both the workshop session id and the draft session id so the page can call `ensureResponseSession` + `UnifiedViewRenderer` in one server component.
## Design Decisions
- **JSON Patch over Yjs** — structural edits at 1-second cadence, not character-level CRDTs. See ADR-0005.
- **Server-authoritative HTTP route** — one trust boundary, easier to audit. Direct-client broadcast reserved for a later latency optimization.
- **Bindings, not `extractSemantics`** — gestures on entity-backed nodes _are_ field edits with a different UI. No asynchronous projection layer.
- **One mixed session per workshop** — aggregation is a single `parent_session_id` filter. Presence is naturally scoped to the shared room.
## Related Modules
- [Sessions](/docs/features/sessions) — the unified session + event log this primitive extends
- [Realtime](/docs/features/realtime) — Supabase Realtime channel conventions (`ChannelType`, `buildChannelName`)
- [View System](/docs/features/view-system) — where `surface_config.canvas_state` lives
- [Entity System](/docs/features/entity-system) — where `FieldWrite`s land
- [ADR-0005](/adr/0005-collab-sessions-and-state-patches) — full rationale