Documentation source

AI-Native Operating Layer

A single overview that scores how AI-native a workspace is across eight dimensions and projects the live state of its loops, agents, software factory, spend, and accountability.

## Overview

The **AI-Native Operating Layer** (`/ai-native`, labelled "Operating Layer") answers one
question for a founding team: _how much of this company actually runs as an AI-native,
closed-loop, queryable system — and what's the highest-leverage thing to fix next?_

It is a **projection + composition layer** over existing platform primitives. It adds **no
new storage, no migration, and no parallel system** — it reads goal systems, sessions, the
cost ledger, the automation ratio, and the entity graph, scores them, and deep-links into
the existing detail surfaces (`/goals`, `/sessions`, `/ai-portfolio`) rather than
duplicating them.

## Key concepts

### The eight dimensions

The readiness rubric (the code-resident equivalent of a criteria set) lives in
`features/ai-native/lib/contracts.ts` as `AI_NATIVE_DIMENSIONS`:

| # | Dimension (`key`) | "Good" looks like |
|---|---|---|
| 1 | `operating_system` | Agents own actions, heartbeats, and recurring work — AI runs ops, not just assists. |
| 2 | `closed_loops` | Important processes are closed loops: goal → work → output → evaluation → next iteration. |
| 3 | `queryable_company` | Records, relations, and documents make decisions and provenance queryable. |
| 4 | `software_factory` | Product work runs as a factory: spec → branch/PR → checks → proof → learning. |
| 5 | `no_human_middleware` | Agent-owned work dominates; humans own outcomes, not routing. |
| 6 | `dri_accountability` | Every loop has a directly-responsible owner; blockers/decisions are visible. |
| 7 | `token_max` | Spend is attributed to outcomes with budget discipline. |
| 8 | `operating_advantage` | Throughput compounds: loops improve, cadence is real, velocity trends up. |

### Evidence-capped scoring

Each dimension produces a `raw` score (0-100) from real signals and a `confidence` (0-1)
equal to the fraction of its expected evidence inputs that are actually present. The final
score is:

```
score = round(raw × (0.4 + 0.6 × confidence))
```

So **absent evidence caps both confidence and score**: an empty cost ledger yields
`token_max` confidence `0`, which limits the score to 40% of `raw` (at most 40), never an
inflated guess. The overview reports `dataCoverage` (mean confidence) so a low-evidence
posture is never mistaken for a confident high one. This is the "scores are not vibes"
invariant, covered directly by `scorecard.test.ts`.
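The formula can be sketched in a few lines of TypeScript. This is a minimal illustration, not the code in `lib/scorecard.ts`: the names `DimensionEvidence`, `cappedScore`, and `meanConfidence` are assumptions made for the example, as is the clamping of out-of-range inputs.

```typescript
// Illustrative sketch only; these names are not the real exports of lib/scorecard.ts.
interface DimensionEvidence {
  raw: number;        // 0-100, derived from real signals
  confidence: number; // 0-1, fraction of expected evidence inputs present
}

// Assumption: out-of-range inputs are clamped rather than rejected.
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// score = round(raw × (0.4 + 0.6 × confidence))
function cappedScore({ raw, confidence }: DimensionEvidence): number {
  const r = clamp(raw, 0, 100);
  const c = clamp(confidence, 0, 1);
  return Math.round(r * (0.4 + 0.6 * c));
}

// Mean confidence across dimensions, in the spirit of `dataCoverage`.
function meanConfidence(dimensions: DimensionEvidence[]): number {
  if (dimensions.length === 0) return 0;
  const total = dimensions.reduce((sum, d) => sum + clamp(d.confidence, 0, 1), 0);
  return total / dimensions.length;
}
```

For a dimension with `raw` of 80, a confidence of 0 gives 32, 0.5 gives 56, and 1 gives 80, so the same raw signal more than doubles in value once its evidence is fully present.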

### Loop sufficiency

`computeLoopSufficiency` (in `lib/loop-sufficiency.ts`) grades one goal system's closed-loop
readiness: five **gate-class (blocking)** checks — metric, work source, action-or-agent,
output contract, eval rubric — plus three **advisory** checks — human gate, budget, owner.
A loop is `ready` (safe to automate) only when every blocking check passes.
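The split between blocking and advisory checks can be sketched as below. The input and result shapes (`LoopInputSketch`, `LoopSufficiencySketch`) and their field names are hypothetical and chosen only to mirror the eight checks named above; the real types in `lib/loop-sufficiency.ts` may differ.

```typescript
// Hypothetical input shape: one boolean per check named in the text.
interface LoopInputSketch {
  hasMetric: boolean;
  hasWorkSource: boolean;
  hasActionOrAgent: boolean;
  hasOutputContract: boolean;
  hasEvalRubric: boolean;
  hasHumanGate: boolean;
  hasBudget: boolean;
  hasOwner: boolean;
}

// Hypothetical result shape.
interface LoopSufficiencySketch {
  ready: boolean;           // true only when every blocking check passes
  failedBlocking: string[]; // names of blocking checks that failed
  failedAdvisory: string[]; // names of advisory checks that failed
}

function loopSufficiencySketch(input: LoopInputSketch): LoopSufficiencySketch {
  const blocking: Array<[string, boolean]> = [
    ["metric", input.hasMetric],
    ["work source", input.hasWorkSource],
    ["action-or-agent", input.hasActionOrAgent],
    ["output contract", input.hasOutputContract],
    ["eval rubric", input.hasEvalRubric],
  ];
  const advisory: Array<[string, boolean]> = [
    ["human gate", input.hasHumanGate],
    ["budget", input.hasBudget],
    ["owner", input.hasOwner],
  ];
  const failed = (checks: Array<[string, boolean]>): string[] =>
    checks.filter(([, passed]) => !passed).map(([name]) => name);

  const failedBlocking = failed(blocking);
  // Advisory failures are reported but never affect `ready`.
  return { ready: failedBlocking.length === 0, failedBlocking, failedAdvisory: failed(advisory) };
}
```

Under this sketch, a loop with all five blocking checks satisfied but no budget or owner is still `ready`; the missing items surface only as advisory findings.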

## How it works

```
app/(app)/ai-native/page.tsx
  └─ getTenantContext() → tenantId
  └─ buildOperatingLayerSnapshot(tenantId)         // features/ai-native/server
       ├─ gatherOperatingSignals(tenantId)         // existing read helpers, graceful fallbacks
       ├─ computeAiNativeScorecard(signals)         // features/ai-native/lib — pure, deterministic
       └─ build{LoopSummaries,AgentCockpit,TokenMax,SoftwareFactory,Accountability}
  └─ <OperatingLayerView snapshot />               // six-tab client shell
```

`buildOperatingLayerSnapshot` **never throws**: the whole body is guarded so any helper
failure still returns a valid, honest snapshot (empty projections + a low scorecard) — the
page renders an empty operating layer, never a 500.
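The never-throws guard amounts to wrapping the whole build in a `try`/`catch` and returning an empty but valid snapshot on failure. The sketch below is an assumption about the shape of that guard, not the code in `server/operating-snapshot.ts`: `SnapshotSketch`, its `degraded` flag, and `buildSnapshotSafely` are hypothetical, and the example is synchronous for brevity where the real function presumably awaits its helpers inside the `try`.

```typescript
// Hypothetical, simplified snapshot shape (not the real OperatingLayerSnapshot).
interface SnapshotSketch {
  loops: string[];
  overallScore: number;
  degraded: boolean; // assumption: a flag marking a fallback result
}

// A fresh empty snapshot per call, so callers can never mutate shared state.
function emptySnapshot(): SnapshotSketch {
  return { loops: [], overallScore: 0, degraded: true };
}

// Guard the whole body: any helper failure yields an empty, honest snapshot.
function buildSnapshotSafely(build: () => SnapshotSketch): SnapshotSketch {
  try {
    return build();
  } catch {
    // Swallow the error deliberately; a real implementation would likely log it.
    return emptySnapshot();
  }
}
```

A caller such as the page component can then render whatever comes back without its own error handling, which is what keeps a failed read from turning into a 500.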

## API reference

| Function | Module | Purpose |
|---|---|---|
| `computeAiNativeScorecard(signals)` | `lib/scorecard.ts` | Pure deterministic engine → `AiNativeScorecard`. |
| `computeLoopSufficiency(input)` | `lib/loop-sufficiency.ts` | Closed-loop readiness for one goal system. |
| `gatherOperatingSignals(tenantId)` | `server/signals.ts` | Collect `OperatingSignals` from existing primitives. |
| `buildOperatingLayerSnapshot(tenantId)` | `server/operating-snapshot.ts` | Full `OperatingLayerSnapshot` for the page. |

## For agents

The operating layer is read-only today; agents act through the primitives it projects:

- Advance a loop: `runGoalCheckIn` / goal-work tools (`listGoalWork` → `claimGoalWork` →
  `closeGoalWork`).
- Report status so the cockpit reflects it: `reportAgentActivity` (writes
  `sessions.metadata.blocker` / `decision_needed` / `next_step` / `last_summary`).
- Resume from an attention item: `getAgentHandoffPacket`.

## Design decisions

- **No parallel system.** Every number is a projection of an existing primitive; the page
  deep-links into the canonical detail surfaces instead of re-implementing them
  (`.claude/rules/no-parallel-systems.md`).
- **Honest over flattering.** Missing evidence pulls scores down, not up. `withProof`
  counts proof **delivered** (closed work), not merely required: `proofRequired` is always
  set, so on its own it says nothing about whether evidence exists.
- **Fail-soft.** A read failure degrades one tile to empty, never the page.

## Related modules

- [Loops](/docs/features/loops) — goal systems and the closed-loop console the scorecard reads.
- [Automation Ratio](/docs/features/automation-ratio) — the agent-vs-human signal behind "no human middleware".
- [Delegation Readiness](/docs/features/delegation-readiness) — per-task readiness (distinct from this company-level scorecard).
- [Sessions](/docs/features/sessions) — the agent runs the cockpit surfaces.
- [Analytics & Cost](/docs/features/analytics-cost) — the cost ledger behind the token-max tab.
