Collab Sessions
Live multi-actor state patches on top of the unified sessions primitive — no CRDT transport, no new tables.
Overview
Collab Sessions is the Sprinter Platform's live-collaboration runtime. A workshop — two or more participants co-editing an SOP board, a task DAG, a form, or any future canvas-backed surface — runs inside a single sessions row with session_type = 'mixed'. Every structural edit is a state.patch event appended to session_events and broadcast to every connected participant. No new table, no Yjs transport, no separate "Shared Artifact" primitive.
See ADR-0005 for the full rationale and rejected alternatives.
Key Concepts
Mixed workshop session
A parent mixed session is the workshop container. Its target (entity or view) identifies what's being edited. Per-user drafts and per-interaction tool runs spawn child sessions via parent_session_id, exactly like the chat message pattern.
state.patch event
A session event type carrying a batch of RFC 6902 JSON Patch ops applied to a specific column subtree of a view or entity row. Every patch is validated, applied authoritatively on the server, logged, and broadcast. The session_events table is the authoritative audit log.
// features/collab/types.ts
type StatePatchPayload = {
event_type: "state.patch"
target: { kind: "view" | "entity"; id: string; path: string }
ops: PatchOp[] // RFC 6902 — at most MAX_PATCH_OPS_PER_EVENT
origin: "user" | "agent"
client_id?: string // used for own-echo suppression on broadcast
}
Broadcast channel
tenant:<tenantId>:session:<sessionId> on Supabase Realtime. Broadcast carries a StatePatchBroadcast envelope (the validated payload plus server-stamped event_id, sequence, applied_at, actor_id). Clients filter their own echoes by client_id.
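A minimal sketch of the envelope and the own-echo filter described above. The field list follows the prose; the exact field types, the `PatchOp` shape, and the helper names `sessionChannelName` and `isOwnEcho` are assumptions made for illustration, not the module's actual exports.

```typescript
// Assumed PatchOp shape; the real definition lives in features/collab/types.ts.
type PatchOp = { op: string; path: string; value?: unknown; from?: string };

type StatePatchPayload = {
  event_type: "state.patch";
  target: { kind: "view" | "entity"; id: string; path: string };
  ops: PatchOp[];
  origin: "user" | "agent";
  client_id?: string;
};

// Validated payload plus the server-stamped fields named in the text.
// Field types (string vs number) are assumptions.
type StatePatchBroadcast = StatePatchPayload & {
  event_id: string;
  sequence: number;
  applied_at: string;
  actor_id: string;
};

// Hypothetical helper: builds the channel name from the documented convention.
function sessionChannelName(tenantId: string, sessionId: string): string {
  return `tenant:${tenantId}:session:${sessionId}`;
}

// Hypothetical helper: true when this client authored the broadcast,
// in which case the patch was already applied optimistically and is skipped.
function isOwnEcho(msg: StatePatchBroadcast, localClientId: string): boolean {
  return msg.client_id !== undefined && msg.client_id === localClientId;
}
```

Agent-origin patches typically carry no `client_id`, so under this sketch they are never treated as echoes and always reach every client.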
Canvas node bindings
A CanvasNodeBinding declares how gestures on a node map to entity field writes. The canvas engine routes gestures through gestureToFieldWrite(binding, gesture), which returns either a FieldWrite (the existing entity update path handles it) or null (layout-only, goes through state.patch on the parent session). Bindings live on node templates and are framework-agnostic.
// features/collab/node-binding.ts
type CanvasNodeBinding = {
entityId?: string
entityTypeSlug?: string
edgesFrom?: { fieldName: string; cardinality?: "one" | "many" }
containerParent?: { fieldName: string }
rankWithinContainer?: { fieldName: string; criteriaSetId?: string }
}
Dragging an edge A → B on a node with edgesFrom: { fieldName: 'depends_on' } produces a FieldWrite that appends A to B.depends_on. No new integration glue per surface.
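The resolver can be sketched as a pure function. The `Gesture` and `FieldWrite` shapes below are hypothetical (the source does not define them), as is the choice to treat a missing `cardinality` as "many"; only the `CanvasNodeBinding` type and the append-on-edge behaviour come from the text above.

```typescript
type CanvasNodeBinding = {
  entityId?: string;
  entityTypeSlug?: string;
  edgesFrom?: { fieldName: string; cardinality?: "one" | "many" };
  containerParent?: { fieldName: string };
  rankWithinContainer?: { fieldName: string; criteriaSetId?: string };
};

// Hypothetical gesture shapes, for illustration only.
type Gesture =
  | { kind: "connect"; sourceEntityId: string; targetEntityId: string }
  | { kind: "move"; x: number; y: number };

// Hypothetical write shape; the real FieldWrite may differ.
type FieldWrite = {
  entityId: string;
  fieldName: string;
  mode: "append" | "set";
  value: string;
};

function gestureToFieldWriteSketch(
  binding: CanvasNodeBinding,
  gesture: Gesture,
): FieldWrite | null {
  if (gesture.kind === "connect" && binding.edgesFrom) {
    // Assumption: an absent cardinality behaves like "many" (append).
    const many = (binding.edgesFrom.cardinality ?? "many") === "many";
    return {
      entityId: gesture.targetEntityId, // edge A → B writes onto B
      fieldName: binding.edgesFrom.fieldName,
      mode: many ? "append" : "set",
      value: gesture.sourceEntityId, // ...and stores A
    };
  }
  // Layout-only gestures resolve to null and travel as state.patch instead.
  return null;
}
```

Keeping the resolver pure means it can be unit-tested without a canvas engine or a database.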
How It Works
client gesture
↓
generate PatchOp[] ──────────────► optimistic local apply
│
▼
POST /api/sessions/[id]/patch
│ (Zod-validated body)
▼
applyCollabPatch(server helper):
1. validate session tenant
2. load target row (views | entities, tenant-scoped)
3. applyPatchToArtifact (pure, RFC 6902)
4. UPDATE row (only the root column of target.path)
5. appendSessionEvent(state.patch)
6. broadcastOnSessionChannel(tenant:t:session:id, state.patch, envelope)
↓
other clients' useCollabPatch receives broadcast, applies ops,
filters own echoes via client_id
Last-write-wins per JSON path. The authoritative state is the row in the DB; the event log is the authoritative history; the broadcast is fire-and-forget for latency.
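Step 3 is described as a pure RFC 6902 application. The sketch below is not the module's applyPatchToArtifact; it is a small stand-in (named `applyOps` here) that handles only `add`, `replace`, and `remove`, and clones via JSON so the input is never mutated.

```typescript
type PatchOp =
  | { op: "add"; path: string; value: unknown }
  | { op: "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

// Decode an RFC 6901 JSON Pointer into reference tokens.
function parsePointer(path: string): string[] {
  if (path === "") return [];
  if (!path.startsWith("/")) throw new Error(`invalid pointer: ${path}`);
  return path
    .slice(1)
    .split("/")
    .map((t) => t.replace(/~1/g, "/").replace(/~0/g, "~"));
}

function applyOps<T>(doc: T, ops: PatchOp[]): T {
  let root: any = JSON.parse(JSON.stringify(doc)); // work on a deep copy
  for (const op of ops) {
    const tokens = parsePointer(op.path);
    if (tokens.length === 0) {
      if (op.op === "remove") throw new Error("cannot remove the document root");
      root = JSON.parse(JSON.stringify(op.value));
      continue;
    }
    // Walk to the parent of the addressed location.
    let parent: any = root;
    for (const t of tokens.slice(0, -1)) {
      if (parent === null || typeof parent !== "object" || !(t in parent)) {
        throw new Error(`path not found: ${op.path}`);
      }
      parent = parent[t];
    }
    if (parent === null || typeof parent !== "object") {
      throw new Error(`path not found: ${op.path}`);
    }
    const key = tokens[tokens.length - 1];
    if (Array.isArray(parent)) {
      if (key !== "-" && !/^(0|[1-9]\d*)$/.test(key)) {
        throw new Error(`bad array index: ${key}`);
      }
      const idx = key === "-" ? parent.length : Number(key);
      if (op.op === "add") {
        if (idx > parent.length) throw new Error(`index out of range: ${op.path}`);
        parent.splice(idx, 0, op.value); // "-" appends
      } else {
        if (idx >= parent.length) throw new Error(`index out of range: ${op.path}`);
        if (op.op === "replace") parent[idx] = op.value;
        else parent.splice(idx, 1);
      }
    } else {
      if (op.op !== "add" && !(key in parent)) {
        throw new Error(`path not found: ${op.path}`);
      }
      if (op.op === "remove") delete parent[key];
      else parent[key] = op.value;
    }
  }
  return root as T;
}
```

Because the function throws on the first invalid op and only ever touches the copy, a rejected batch leaves the caller's state untouched, which matches the all-or-nothing behaviour a server-authoritative apply needs.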
API Reference
Client
useCollabPatch<T>(args)
const { state, apply, status } = useCollabPatch<CanvasState>({
sessionId,
tenantId,
target: { kind: "view", id: viewId, path: "surface_config.canvas_state" },
initial: initialCanvasState,
})
// Optimistically apply an RFC 6902 patch locally and POST to the server.
await apply([{ op: "replace", path: "/nodes/0/position/x", value: 120 }])
Reverts local state on server rejection. Subscribes to the session broadcast channel for the lifetime of the component. status reports connecting | live | offline | error.
Reconnect rehydration. When the Realtime channel drops and reconnects (error/offline → live), the hook fetches GET /api/sessions/[id]/events?eventTypes=state.patch&afterSequence=<lastSeen>&limit=500 and applies missed patches in sequence order. Own-client echoes are filtered so non-idempotent ops (array appends, copy, etc.) do not double-apply. This closes the silent-divergence window a naive subscriber would have across a network blip.
Pass initialSequence when the initial snapshot was read at a known sequence — rehydration uses that cursor to avoid refetching events already baked into the snapshot. Omitting it means "replay from 0 on the first reconnect," which is idempotent under LWW but wastes a page fetch.
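The rehydration step can be sketched as two small pure helpers: one builds the documented events URL from the last seen sequence, the other decides which fetched events to apply. The names `rehydrateUrl`, `planRehydrate`, and the `LoggedPatch` shape are assumptions; the hook's real internals may differ.

```typescript
// Assumed shape of one row returned by the events endpoint.
type LoggedPatch = { sequence: number; client_id?: string; ops: unknown[] };

// Builds the fetch URL described above from the last sequence the client saw.
function rehydrateUrl(sessionId: string, lastSeen: number): string {
  const query = [
    `eventTypes=${encodeURIComponent("state.patch")}`,
    `afterSequence=${lastSeen}`,
    "limit=500",
  ].join("&");
  return `/api/sessions/${encodeURIComponent(sessionId)}/events?${query}`;
}

// Picks the patches to apply after a reconnect and the cursor to keep.
function planRehydrate(
  events: LoggedPatch[],
  lastSeen: number,
  localClientId: string,
): { toApply: LoggedPatch[]; nextCursor: number } {
  // Ignore anything already baked into local state, then order by sequence.
  const missed = events
    .filter((e) => e.sequence > lastSeen)
    .sort((a, b) => a.sequence - b.sequence);
  // The cursor advances past own echoes too, so they are not refetched.
  const nextCursor = missed.reduce((m, e) => Math.max(m, e.sequence), lastSeen);
  // Own patches were applied optimistically; skipping them avoids
  // double-applying non-idempotent ops such as array appends.
  const toApply = missed.filter((e) => e.client_id !== localClientId);
  return { toApply, nextCursor };
}
```

Passing a known `initialSequence` as `lastSeen` on the first reconnect is what keeps the first page fetch from replaying events the snapshot already contains.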
The hook also returns remoteVersion: number — a monotonic counter that bumps only on remote-origin state changes (channel broadcasts + rehydrate fills). Consumers whose underlying renderer is uncontrolled (React Flow's useNodesState, CodeMirror, Monaco) can use this as a React key to force a remount when canonical state moves out from under them.
Canvas surface adoption (v1.5)
CanvasSurface opts into collab automatically when the caller supplies a collab prop on SurfaceProps:
<CanvasSurface
blocks={blocks}
surfaceConfig={view.surface_config}
collab={{
sessionId: workshopSessionId, // a mixed session
tenantId: activeTenantId,
viewId: view.id,
initialSequence: view.last_event_sequence, // optional
}}
/>
When collab is absent the surface renders its original debounced-save path with zero behavior change: /whiteboard, entity-detail canvases, and the workspace editor keep working unchanged. When present, whole-canvas saves become replace / JSON patches and the engine remounts on remoteVersion so remote edits appear immediately.
v1.5 tradeoff — engine reseed on remote patches. The surface keys CanvasEngine on remoteVersion, so every incoming remote patch unmounts and remounts the underlying React Flow instance, reseeding from the new canonical state. Minimal change to keep the current uncontrolled useNodesState/useEdgesState engine viable under multi-actor writes. Cost: transient UI state (current selection, pan, zoom) resets on each remote patch. Fine for SOP-scale workshops where remote patches land a few per minute; a controlled-engine follow-up is captured for high-frequency surfaces.
subscribeToPatchChannel(supabase, ctx, onPatch)
Lower-level channel subscription. Filters own-echoes by ctx.clientId. Returns { close }.
gestureToFieldWrite(binding, gesture)
Pure resolver. Returns a FieldWrite or null.
Server
applyCollabPatch(args) — features/collab/server/apply-patch.ts
Server-authoritative orchestrator. Call this from internal code (future agent applyStatePatch tool, Inngest jobs). The HTTP route POST /api/sessions/[id]/patch delegates here.
const { event_id, sequence, applied_at } = await applyCollabPatch({
sessionId,
tenantId,
actorId: userId,
target,
ops,
origin: "user", // or "agent"
clientId, // optional — omit for agent origin
})
Throws on session tenant mismatch, missing row, empty-row update, terminal session status, or a non-mixed session type. CAS exhaustion is split:
- CollabRlsDeniedError (status: 403): every retry saw the same updated_at but the UPDATE still missed. The row is visible under SELECT-RLS but the UPDATE policy rejects this user (e.g. a member hitting an editor+ write policy).
- CollabContentionError (status: 409): updated_at advanced on every retry. This is a real concurrent-writer storm; the caller should back off and retry the user gesture.
apiErrorResponse maps both to their declared HTTP status so clients see 403/409, not a misleading 500.
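One way to picture the 403/409 split: after the CAS loop runs out of retries, compare the updated_at values it observed. The classes below are stand-ins that only mirror the documented names and statuses, and `classifyCasExhaustion` is a hypothetical helper. Treating a mixed history (some retries advanced, some did not) as contention is an assumption; the source only describes the two pure cases.

```typescript
// Stand-ins mirroring the documented error names and HTTP statuses.
class CollabRlsDeniedError extends Error {
  readonly status = 403;
}
class CollabContentionError extends Error {
  readonly status = 409;
}

// observedUpdatedAt[i] is the row's updated_at as read before attempt i.
function classifyCasExhaustion(
  observedUpdatedAt: string[],
): CollabRlsDeniedError | CollabContentionError {
  // Did any retry see a different updated_at than the attempt before it?
  const advanced = observedUpdatedAt.some(
    (value, i) => i > 0 && value !== observedUpdatedAt[i - 1],
  );
  if (advanced) {
    // Someone else kept winning the write: back off and retry the gesture.
    return new CollabContentionError("row changed between CAS attempts");
  }
  // Nothing moved, yet the UPDATE matched zero rows: the write policy said no.
  return new CollabRlsDeniedError("UPDATE rejected although the row was unchanged");
}
```

The useful property is that the caller can branch on `status` alone, without parsing error messages.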
broadcastOnSessionChannel(ctx, event, payload) — features/realtime/server.ts
Admin-client Supabase Realtime Broadcast wrapper. Fire-and-forget.
For Agents
Agent writes go through the same primitive with origin: "agent". Supervised writes (from chat) inherit the user's permissions. Autonomous writes (heartbeat, Inngest) use the agent's role permissions.
applyStatePatch tool (v3)
Registered by createCollabToolDefinitions() in features/tools/collab-tools.ts. Thin delegate to the server applyCollabPatch helper — same tenant scoping, same CAS, same error-code split (403/409/5xx). Appended session event is stamped origin: 'agent', role: 'agent' so the transcript UI can badge agent writes distinctly from human writes.
applyStatePatch({
sessionId: workshopSessionId,
target: {
kind: "view",
id: viewId,
path: "surface_config.canvas_state",
},
ops: [
{ op: "add", path: "/nodes/-", value: { id: "n-new", position: { x: 200, y: 100 } } },
],
})
Scope guard. When options.allowedSessionId is set at tool-construction time, patches to other sessions are rejected before hitting the server. Heartbeat / task executors pin this to the running workshop session so an agent's plan can't spray writes across workshops mid-run.
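The scope guard amounts to a check that runs before any server call. This is a sketch under assumptions: `checkSessionScope` and the `ScopeOptions` type are hypothetical names; only `allowedSessionId` and the "out_of_scope" code come from this page.

```typescript
type ScopeOptions = { allowedSessionId?: string };
type OutOfScopeError = { error: string; code: "out_of_scope" };

// Returns an error envelope when the patch targets a session other than
// the pinned one, or null when the call may proceed to the server helper.
function checkSessionScope(
  sessionId: string,
  options: ScopeOptions,
): OutOfScopeError | null {
  if (
    options.allowedSessionId !== undefined &&
    sessionId !== options.allowedSessionId
  ) {
    return {
      error: `session ${sessionId} is outside the allowed session ${options.allowedSessionId}`,
      code: "out_of_scope",
    };
  }
  // No pin configured, or the pin matches: no objection from the guard.
  return null;
}
```

Because the pin is captured when the tool is constructed, nothing in the agent's generated arguments can widen it mid-run.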
Groups & admin-client posture. The tool is registered to a single collab group — NOT to the default entity bundle. Callers that want agent live-edits must deliberately include "collab" in the tool-group list once workshop context is loaded. Rationale: applyCollabPatch runs its row UPDATE via the admin Supabase client, which bypasses RLS; the dispatch-time permission check + allowedSessionId are the effective gate today. Both tighten once workshop-membership RBAC (MEDIUM #10) lands and the tool retargets to workshop.team.edit.
Error envelope. Failures return { error, code } where code is one of "denied" (403 — RLS rejected the UPDATE, stop retrying), "contention" (409 — concurrent-writer storm, back off and retry), "out_of_scope" (sessionId is outside allowedSessionId), or "unknown" (5xx or un-tagged error, use generic retry policy). Propagates the structured status from CollabRlsDeniedError / CollabContentionError so agent retry logic can branch on intent, not regex.
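Agent retry logic can then branch on the code directly. The policy below is an illustrative sketch, not the platform's actual retry policy: the function name, the delays, and the five-attempt cap are assumptions.

```typescript
type CollabToolErrorCode = "denied" | "contention" | "out_of_scope" | "unknown";
type RetryDecision = { retry: false } | { retry: true; delayMs: number };

// attempt is zero-based: 0 means the first failure was just observed.
function decideRetry(code: CollabToolErrorCode, attempt: number): RetryDecision {
  const MAX_ATTEMPTS = 5; // assumed cap
  if (attempt >= MAX_ATTEMPTS) return { retry: false };
  switch (code) {
    case "denied":
    case "out_of_scope":
      // Retrying cannot change a permission or scope decision.
      return { retry: false };
    case "contention":
      // Exponential backoff: 100 ms, 200 ms, 400 ms, ... capped at 5 s.
      return { retry: true, delayMs: Math.min(100 * Math.pow(2, attempt), 5000) };
    case "unknown":
      // Generic policy: a couple of fixed-delay retries, then give up.
      return attempt < 2 ? { retry: true, delayMs: 1000 } : { retry: false };
  }
}
```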
Permission (interim). Declares entities.team.update as its coarse gate until the workshop-membership + workshop.team.edit permission lands in PR-v4. Tracked in documents/work/2026-04-23-collab-sessions/followups.md as MEDIUM #10.
When to use vs. existing tools. Use applyStatePatch for structural edits (move a node, add an edge, rearrange a container). Keep updateEntity / submitResponse for content edits on entity fields. The two paths coexist inside one workshop session: the state-patch stream carries layout history; response sessions carry field draft-and-promote flows.
Response-mode views (v4 + v4.1)
Every views row now carries edit_mode text NOT NULL DEFAULT 'direct' with a CHECK constraint of ('direct', 'response'). Default 'direct' preserves today's single-user behaviour for every existing view.
When a view has edit_mode = 'response', collab gestures against that view accumulate as drafts on a per-user child response session (session_type='response', parent_id=<workshop>, status='draft') instead of mutating the canonical view row. Submission + promotion flow through the existing entity_responses pipeline — workshop drafts become proposals that a tenant admin promotes explicitly. The mode is switched at view creation / edit time by the caller that owns that view's lifecycle.
app_permission::workshop.team.edit is reserved for members participating in a workshop session's live-edit stream. Until the workshop-membership table + role grants land, applyStatePatch (v3) uses entities.team.update as a coarse interim gate.
v4.1 routing (shipped)
- Binding field: CollabBinding.editMode: 'direct' | 'response' (default 'direct'); response mode additionally requires responseSessionId on the binding.
- Hook: useCollabPatch branches on editMode. Direct mode POSTs to /api/sessions/[id]/patch (unchanged). Response mode POSTs to /api/sessions/[id]/patch-draft with response_session_id in the body.
- Server helper: features/collab/server/apply-patch-draft.ts#applyCollabPatchDraft validates the workshop session (mixed, active, same tenant), validates the child response session (session_type='response', parent_id=<workshop>, status='draft', same tenant), and applies the RFC 6902 ops against draft_values with the same CAS-on-updated_at loop as direct mode. RLS-denied vs. real contention is surfaced as CollabRlsDeniedError (403) vs. CollabContentionError (409), matching the direct path.
- Event type: state.patch.draft, appended on the WORKSHOP session's event log so participants watching the transcript see presence. The event metadata carries the full ops for audit + rehydrate.
- Broadcast payload is REDACTED: the presence broadcast on tenant:<t>:session:<id> carries {event_type, event_id, sequence, applied_at, actor_id, response_session_id, target, client_id} and intentionally strips ops and values. Drafts stay private to the authoring user until the response is submitted.
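The redaction step can be sketched as an explicit allow-list copy, so a field added to the logged event later does not reach the broadcast by accident. The type and function names and the field types are assumptions; the field list is the one given in the last bullet above.

```typescript
// Assumed shape of the logged draft event (full ops kept for audit).
type DraftPatchEvent = {
  event_type: "state.patch.draft";
  event_id: string;
  sequence: number;
  applied_at: string;
  actor_id: string;
  response_session_id: string;
  target: { kind: "view" | "entity"; id: string; path: string };
  client_id?: string;
  ops: { op: string; path: string; value?: unknown }[];
};

// Presence-only broadcast: everything except the ops and their values.
type DraftPresenceBroadcast = Omit<DraftPatchEvent, "ops">;

function redactDraftBroadcast(event: DraftPatchEvent): DraftPresenceBroadcast {
  // Copy field by field instead of spreading, so nothing extra slips through.
  return {
    event_type: event.event_type,
    event_id: event.event_id,
    sequence: event.sequence,
    applied_at: event.applied_at,
    actor_id: event.actor_id,
    response_session_id: event.response_session_id,
    target: event.target,
    client_id: event.client_id,
  };
}
```

Other participants can still render "someone is drafting on this view" from the redacted payload, while the draft content stays in the event log and the response session.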
Phase 7 adoption chain (shipped)
The three pieces of plumbing that turn edit_mode='response' from an inert column into a live write path:
- UnifiedViewRenderer.collabSession: optional prop on features/views/components/unified-view-renderer.tsx. When supplied, the renderer calls buildCollabBinding(view, collabSession) and threads the result to SurfaceRenderer as collab. Absent → single-user rendering. Type: CollabSessionInput = { sessionId, tenantId, responseSessionId?, initialSequence? }. The call is intentionally bare (no useMemo): downstream consumers destructure the binding to primitives (sessionId, tenantId, responseSessionId) before using them as useEffect deps, so reference instability of the returned collab object does not churn Realtime channel subscriptions. Surfaces that need to hold the binding directly in a hook dep should memoize inside the surface, not at this seam.
- buildCollabBinding(view, collabSession): pure helper in features/views/lib/build-collab-binding.ts. Returns a CollabBindingResolution discriminated union: { kind: 'none' } when no workshop session is supplied (single-user render), { kind: 'binding', binding } when the binding is usable, or { kind: 'misconfigured', reason: 'missing-response-session-id' } when edit_mode='response' was requested without a responseSessionId. On misconfig it also emits console.error naming the fix and captures a Sentry warning (tags.module='collab', phase='build-binding') so prod incidents are visible. The renderer fails closed on misconfig: it forces editable=false and strips every write callback before forwarding to SurfaceRenderer so canvas writes cannot leak in-progress draft work onto the canonical view state. Unknown edit_mode values fall back to 'direct'; a responseSessionId supplied alongside edit_mode='direct' is stripped before returning so stale draft ids don't leak into realtime metadata.
- ensureResponseSession({ workshopSessionId, actorId, tenantId, viewId?, entityId? }): server-only admin-client helper in features/responses/server/ensure-response-session.ts. Looks up or lazily creates the session_type='response' child session for this participant. Pre-checks the workshop row: tenant equality, session_type='mixed', and status ∈ WORKSHOP_ACCEPTS_DRAFT_STATUSES (running, pending, waiting_human, awaiting_tool, idle; draft and terminal statuses are rejected). A partial unique index sessions_response_draft_per_parent_user_uniq on (parent_id, user_id, tenant_id) WHERE session_type='response' AND status='draft' AND parent_id IS NOT NULL makes the get-or-create atomic: a concurrent duplicate INSERT raises 23505; the race loser re-selects the winner's id and converges on the same draft. Migration includes SET LOCAL lock_timeout = '5s' to avoid blocking hot-path sessions writes during deploy.
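The resolution rules for buildCollabBinding can be sketched as follows. This is not the real helper: the `CollabBinding` field set and the `ViewLike` type are assumptions, and the console.error and Sentry side effects are left out so the sketch stays pure.

```typescript
type EditMode = "direct" | "response";

type CollabSessionInput = {
  sessionId: string;
  tenantId: string;
  responseSessionId?: string;
  initialSequence?: number;
};

// Assumed binding shape; the real CollabBinding may carry more fields.
type CollabBinding = {
  sessionId: string;
  tenantId: string;
  viewId: string;
  editMode: EditMode;
  responseSessionId?: string;
  initialSequence?: number;
};

type CollabBindingResolution =
  | { kind: "none" }
  | { kind: "binding"; binding: CollabBinding }
  | { kind: "misconfigured"; reason: "missing-response-session-id" };

// Hypothetical minimal view shape for the sketch.
type ViewLike = { id: string; edit_mode?: string };

function buildCollabBindingSketch(
  view: ViewLike,
  collabSession?: CollabSessionInput,
): CollabBindingResolution {
  // No workshop session: plain single-user rendering.
  if (!collabSession) return { kind: "none" };

  // Unknown edit_mode values fall back to 'direct'.
  const editMode: EditMode = view.edit_mode === "response" ? "response" : "direct";

  const base = {
    sessionId: collabSession.sessionId,
    tenantId: collabSession.tenantId,
    viewId: view.id,
    initialSequence: collabSession.initialSequence,
  };

  if (editMode === "response") {
    // Response mode without a draft session is unusable: fail closed.
    if (!collabSession.responseSessionId) {
      return { kind: "misconfigured", reason: "missing-response-session-id" };
    }
    return {
      kind: "binding",
      binding: { ...base, editMode, responseSessionId: collabSession.responseSessionId },
    };
  }

  // Direct mode: any supplied responseSessionId is deliberately dropped.
  return { kind: "binding", binding: { ...base, editMode } };
}
```

A discriminated union forces the renderer to handle the misconfigured case explicitly, which is what makes the fail-closed behaviour hard to skip.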
SECURITY — caller contract. ensureResponseSession uses createAdminClient() (RLS-bypassing) and the helper trusts both actorId and tenantId. Callers MUST resolve them via requireAuth() / getUserId() and getTenantContext() / getActiveTenantId() respectively. Workshop-membership is NOT enforced here — it's a separate follow-up. See the file-level JSDoc for the full contract + @example.
Page-level usage:
// Server component (page / layout)
const { userId } = await requireAuth(); // from features/tenant/auth
const { tenantId } = await getTenantContext(); // from features/tenant/context
const { responseSessionId } = await ensureResponseSession({
workshopSessionId: workshop.id,
actorId: userId,
tenantId,
viewId: view.id,
});
// Pass to UnifiedViewRenderer
<UnifiedViewRenderer
view={view} // carries edit_mode
resolvedBlocks={blocks}
collabSession={{
sessionId: workshop.id,
tenantId,
responseSessionId, // omit in direct-mode workshops
initialSequence: lastSeenSequence,
}}
/>
The first real consumer is the v5 SOP program-canvas adoption; that work wires getOrCreateProgramCanvas to return both the workshop session id and the draft session id so the page can call ensureResponseSession + UnifiedViewRenderer in one server component.
Design Decisions
- JSON Patch over Yjs: structural edits at 1-second cadence, not character-level CRDTs. See ADR-0005.
- Server-authoritative HTTP route: one trust boundary, easier to audit. Direct-client broadcast reserved for a later latency optimization.
- Bindings, not extractSemantics: gestures on entity-backed nodes are field edits with a different UI. No asynchronous projection layer.
- One mixed session per workshop: aggregation is a single parent_session_id filter. Presence is naturally scoped to the shared room.
Related Modules
- Sessions: the unified session + event log this primitive extends
- Realtime: Supabase Realtime channel conventions (ChannelType, buildChannelName)
- View System: where surface_config.canvas_state lives
- Entity System: where FieldWrites land
- ADR-0005: full rationale