Company Brain
A source-backed, retrieval-optimized knowledge graph that external agents query over MCP — canonical node types, fused retrieval via getContextPack, and a goal-loop ingestion contract.
# Company Brain
The **company brain** is not a new system. It is the Amble entity graph used as a durable,
source-backed context layer that agents — in-app and external (via MCP) — query to understand who
the user is, what they're working on, who and what matters, what was decided, what's open, and
which sources justify each conclusion.
It is assembled from primitives the platform already owns:
- **Nodes** = `entities` rows; **node kinds** = `entity_types` (DB-driven `json_schema`).
- **Edges** = typed `entity_relations` (graph traversal + provenance links).
- **Documents** = `document` entities + chunked, embedded `document_chunks` (hybrid RAG).
- **Retrieval** = live pgvector + Postgres FTS + graph, fused by `getContextPack`.
No parallel store, no new engine. See `.claude/rules/no-parallel-systems.md`.
## The `@amble/company-brain` bundle
Install the bundle (`features/custom/bundles/company-brain/manifest.json`) to seed a tenant with
the canonical knowledge-graph node types. Each carries provenance, confidence, freshness, and
sensitivity in its `json_schema` so retrieval can rank by trust and recency.
| Type | What it holds |
| --- | --- |
| `person` | People — relationship, role, preferences, open loops |
| `organization` | Companies / orgs / teams |
| `project` | Active, time-bound work with a definition-of-done |
| `decision` | Statement, rationale, alternatives, consequences, status |
| `note` | Atomic, concept-oriented note (`kind`: insight / preference / procedure / meeting / commitment / report / state / source-manifest / validation-report / completion-audit) |
| `topic` | Concept/theme node for concept-oriented linking |
| `context-pack` | A curated retrieval bundle for a bounded task — "use this when…" + agent instructions |
| `source` | Provenance anchor (document / url / integration / manual) that other records cite |
| `source-item` | An individual captured item (email thread, article, message) |
**Canonical edges:** `person —works_at→ organization`, `person —involved_in→ project`,
`project —produced_decision→ decision`, `decision —sourced_from→ source`, `note —about→ topic`,
`note —mentions→ (person|organization|project)`, `(any) —sourced_from→ source`,
`context-pack —includes→ (any)`.
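
The canonical edges above can be encoded as a small rule table, which is handy for checking an edge before writing it. This is a minimal sketch, not the platform's validator: the `EdgeRule` shape, the `"*"` wildcard for `(any)`, and the `isCanonicalEdge` helper are all assumptions made for illustration.

```typescript
// Hypothetical encoding of the canonical edges listed above.
// "*" stands in for "(any)"; the real platform may model this differently.
type EdgeRule = { from: string; relation: string; to: string[] };

const CANONICAL_EDGES: EdgeRule[] = [
  { from: "person", relation: "works_at", to: ["organization"] },
  { from: "person", relation: "involved_in", to: ["project"] },
  { from: "project", relation: "produced_decision", to: ["decision"] },
  { from: "decision", relation: "sourced_from", to: ["source"] },
  { from: "note", relation: "about", to: ["topic"] },
  { from: "note", relation: "mentions", to: ["person", "organization", "project"] },
  { from: "*", relation: "sourced_from", to: ["source"] },
  { from: "context-pack", relation: "includes", to: ["*"] },
];

// Returns true when (from, relation, to) matches at least one canonical rule.
function isCanonicalEdge(from: string, relation: string, to: string): boolean {
  return CANONICAL_EDGES.some(
    (rule) =>
      rule.relation === relation &&
      (rule.from === "*" || rule.from === from) &&
      (rule.to.includes("*") || rule.to.includes(to)),
  );
}
```

Keeping the rules as data rather than branching code means a new edge kind is a one-line addition.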
**Provenance field group** (on every claim-bearing type): `summary`, `aliases[]`, `lifecycle`
(active / area / resource / archived — PARA-style relevance routing), `confidence`
(high / medium / low), `source`, `source_url`, `last_verified_at`, `sensitivity`
(public / internal / confidential). These are `json_schema` content properties — not `FieldConfig`
keys — so the strict entity-type write-gate never strips them.
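
To make "rank by trust and recency" concrete, the provenance field group can be typed and folded into a single score. The sketch below is illustrative only: the field types, the confidence weights, the 90-day half-life, and the `trustScore` function are assumptions, not the ranking `getContextPack` actually uses.

```typescript
type Confidence = "high" | "medium" | "low";
type Lifecycle = "active" | "area" | "resource" | "archived";
type Sensitivity = "public" | "internal" | "confidential";

// Field names follow the provenance group above; the types are assumed.
interface Provenance {
  summary: string;
  aliases: string[];
  lifecycle: Lifecycle;
  confidence: Confidence;
  source: string;
  source_url: string;
  last_verified_at: string; // assumed to be an ISO 8601 timestamp
  sensitivity: Sensitivity;
}

// Assumed weights; tune or replace for a real ranking.
const CONFIDENCE_WEIGHT: Record<Confidence, number> = { high: 1.0, medium: 0.6, low: 0.3 };
const MS_PER_DAY = 86400000;

// Hypothetical trust score: confidence weight, halved every `halfLifeDays`
// since the record was last verified.
function trustScore(p: Provenance, now: Date, halfLifeDays = 90): number {
  const ageDays = Math.max(0, (now.getTime() - Date.parse(p.last_verified_at)) / MS_PER_DAY);
  return CONFIDENCE_WEIGHT[p.confidence] * Math.pow(0.5, ageDays / halfLifeDays);
}
```

Exponential decay is one simple way to let a stale high-confidence record fall below a fresh medium-confidence one.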
## Retrieval — `getContextPack` (the front door)
`getContextPack` is a read-only, external MCP tool that fuses the three existing hybrid searches
into one ranked, cited, token-budgeted payload — so an agent makes **one call**, not three.
```
getContextPack({ query, typeSlug?, maxEntities?, maxDocuments?, maxConnections? })
→ {
  entities: [{ id, title, type, typeSlug, href, content }],          // hybrid semantic+FTS, compact refs
  documents: [{ documentId, documentTitle, page, score, snippet }],  // hybrid chunk RAG, cited
  connections: [{ id, title, type, typeSlug, href, relationshipType, viaEntityId }], // graph neighbors
  summary: { entityCount, documentCount, connectionCount }
}
```
It returns **compact references** (just-in-time retrieval, per Anthropic's context-engineering
guidance) — dereference a full record with `getEntity`. It is tenant-scoped and never crosses a
tenant boundary. Implementation: `features/tools/context-pack-tools.ts` (composition over
`searchEntitiesHybrid` + `searchDocumentsHybrid` + an `entity_relations` graph walk).
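
On the consuming side, an agent typically flattens the pack into a short, cited context block before reasoning over it. The following sketch types the documented return shape and renders it; the concrete field types (for example `page` and `score` as numbers), the `renderContext` helper, and its `minScore` cutoff are assumptions for illustration.

```typescript
// Shapes mirror the documented payload; individual field types are assumed.
interface EntityRef { id: string; title: string; type: string; typeSlug: string; href: string; content: string }
interface DocumentHit { documentId: string; documentTitle: string; page: number; score: number; snippet: string }
interface ConnectionRef {
  id: string; title: string; type: string; typeSlug: string; href: string;
  relationshipType: string; viaEntityId: string;
}
interface ContextPack {
  entities: EntityRef[];
  documents: DocumentHit[];
  connections: ConnectionRef[];
  summary: { entityCount: number; documentCount: number; connectionCount: number };
}

// Hypothetical helper: one line per reference, each carrying its citation key
// so the agent can surface provenance alongside any claim it makes.
function renderContext(pack: ContextPack, minScore = 0): string {
  const lines: string[] = [];
  for (const e of pack.entities) {
    lines.push(`[entity:${e.id}] ${e.title} (${e.typeSlug})`);
  }
  for (const d of pack.documents) {
    if (d.score >= minScore) {
      lines.push(`[doc:${d.documentId} p.${d.page}] ${d.documentTitle}: ${d.snippet}`);
    }
  }
  for (const c of pack.connections) {
    lines.push(`[via:${c.viaEntityId}] ${c.relationshipType} -> ${c.title} (${c.typeSlug})`);
  }
  return lines.join("\n");
}
```

Only the IDs travel in the rendered block; anything that needs the full record is still fetched on demand with `getEntity`.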
## For agents — how to consume the brain
1. **Lookup, don't dump.** Call `getContextPack({ query })` first. Expand only what you need with
`getEntity`. Keep context small.
2. **Cite.** Every claim in the brain carries provenance — surface it. Ground or abstain: if the
brain doesn't know, say "not in the brain", never guess.
3. **Respect sensitivity.** Honor each record's `sensitivity` when deciding what to surface.
4. **Write back with discipline.** Agent-created records get the same provenance, confidence, and
`sourced_from` edges as human writes. Canonicalize before creating: use `searchEntities` or
`getContextPack` to find an existing node and update it (append provenance, add an alias) rather
than creating a duplicate.
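
The canonicalize-before-create rule in step 4 can be sketched against an in-memory store. Everything here is hypothetical: `BrainNode`, `normalize`, and `upsertCanonical` stand in for a real `searchEntities` lookup followed by an update or create call, and the matching rule (case- and whitespace-insensitive title or alias match within one type) is an assumption.

```typescript
// Hypothetical in-memory stand-in for an entity record.
interface BrainNode { id: string; typeSlug: string; title: string; aliases: string[]; sources: string[] }

// Assumed matching rule: trim, lowercase, collapse internal whitespace.
const normalize = (s: string): string => s.trim().toLowerCase().replace(/\s+/g, " ");

// Find an existing node by title or alias; update it if found, create otherwise.
function upsertCanonical(store: BrainNode[], typeSlug: string, title: string, source: string): BrainNode {
  const key = normalize(title);
  const existing = store.find(
    (n) => n.typeSlug === typeSlug && [n.title, ...n.aliases].some((name) => normalize(name) === key),
  );
  if (existing) {
    const alias = title.trim();
    // Record the new spelling as an alias and append provenance, never duplicate.
    if (alias !== existing.title && !existing.aliases.includes(alias)) existing.aliases.push(alias);
    if (!existing.sources.includes(source)) existing.sources.push(source);
    return existing;
  }
  const created: BrainNode = {
    id: `node-${store.length + 1}`,
    typeSlug,
    title: title.trim(),
    aliases: [],
    sources: [source],
  };
  store.push(created);
  return created;
}
```

A production version would likely match on embeddings or fuzzy search as well, but the update-instead-of-duplicate control flow stays the same.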
## Zero-instruction operation — the ambient brain
The brain is **ambient**: an external agent (Codex CLI, Claude Code on a tech team) connecting to
a tenant's MCP server learns to use Amble as its company brain + control plane *without* a repo
rule, `AGENTS.md`, or explicit instruction. The mechanism is the optional `instructions` field of the
MCP `initialize` result, which the server returns to every client during the handshake. Amble sets it in
`features/mcp/amble-server.ts` (`AMBLE_MCP_INSTRUCTIONS`): retrieve-first via `getContextPack`,
write canonical records back, pick up work via the goal-loop tools, report via
`reportAgentActivity`, and pull operating method via `ambleGetSkill`. The instructions are
tenant-agnostic because the connecting URL already pins the tenant.
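
For orientation, this is roughly where the instructions sit in an `initialize` response. In the MCP schema, `instructions` is an optional string returned next to `serverInfo`. The sketch below is an assumption-laden illustration: the instruction text is a placeholder (the real `AMBLE_MCP_INSTRUCTIONS` string is not reproduced here), and the server name, version, and `buildInitializeResult` helper are hypothetical.

```typescript
// Minimal subset of the MCP initialize result, typed locally for the sketch.
interface InitializeResult {
  protocolVersion: string;
  capabilities: Record<string, unknown>;
  serverInfo: { name: string; version: string };
  instructions?: string;
}

// Placeholder text only; it paraphrases the behaviors listed above.
const EXAMPLE_INSTRUCTIONS: string = [
  "Retrieve first: call getContextPack before answering.",
  "Write canonical records back with provenance.",
  "Pick up work via the goal-loop tools.",
  "Report via reportAgentActivity.",
  "Pull operating method via ambleGetSkill.",
].join("\n");

// Hypothetical builder; name and version values are placeholders.
function buildInitializeResult(instructions: string): InitializeResult {
  return {
    protocolVersion: "2025-06-18",
    capabilities: { tools: {} },
    serverInfo: { name: "amble", version: "0.0.0" },
    instructions,
  };
}
```

Because the text travels in the handshake itself, a client needs no local configuration to receive it.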
## The one command — method + goal both live in Amble
Everything an agent needs is pulled from Amble over MCP, so every agent (Claude Code, Codex,
OpenClaw) gets the same latest version with no local copy to drift:
- **The method** is the `company-brain-ingestion` **platform skill** (`features/skills/platform-skills.ts`)
— the operating playbook (compiler process, canonicalize, provenance, validation gates, retrieval
audit, safety). Pull it with `ambleGetSkill("company-brain-ingestion")`.
- **The goal** is the `company-brain-ingestion` **goal-loop template** (`features/entities/lib/goal-loop-templates.ts`)
— a generic, every-tenant template. Stand it up with
`createGoalFromTemplate({ templateSlug: "company-brain-ingestion" })`.
- **The launcher** is the repo skill `.claude/skills/company-brain/SKILL.md` — the thin bootstrap a
Claude Code / Codex agent invokes to kick the loop off.
Filling the brain is then a **goal loop**, not a one-shot script: the agent reads its contract with
`getGoalBrief` / `getGoalLog` (both external reads) and runs
`listGoalWork → claimGoalWork → closeGoalWork(proof)` — claiming only `controlled_execute` work,
renewing under the ~12h lease, reporting via `reportAgentActivity`, resuming from
`getAgentHandoffPacket`. The full, copy-ready prompt lives at
`documents/prompts/company-brain-ingestion-codex.md`.
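
One pass of the `listGoalWork → claimGoalWork → closeGoalWork(proof)` loop might look like the sketch below. The `GoalLoopClient` interface, its argument and return shapes, the `WorkItem` fields, and the one-hour renewal margin are all assumptions; the real MCP tools may differ. Reporting and handoff (`reportAgentActivity`, `getAgentHandoffPacket`) are left out for brevity.

```typescript
// Hypothetical shapes: the real tool arguments and results may differ.
interface WorkItem { id: string; tier: string }

interface GoalLoopClient {
  listGoalWork(goalId: string): Promise<WorkItem[]>;
  claimGoalWork(workId: string): Promise<boolean>; // assumed: false if the claim is refused
  closeGoalWork(workId: string, proof: string): Promise<void>;
}

const LEASE_MS = 12 * 60 * 60 * 1000; // the ~12h lease mentioned above

// Assumed policy: renew once less than `marginMs` of the lease remains.
function shouldRenewLease(claimedAtMs: number, nowMs: number, marginMs = 60 * 60 * 1000): boolean {
  return nowMs - claimedAtMs >= LEASE_MS - marginMs;
}

// Runs one pass and returns the ids of the work items that were closed.
async function runOnePass(
  client: GoalLoopClient,
  goalId: string,
  execute: (item: WorkItem) => Promise<string>,
): Promise<string[]> {
  const closed: string[] = [];
  for (const item of await client.listGoalWork(goalId)) {
    if (item.tier !== "controlled_execute") continue; // only claim controlled_execute work
    if (!(await client.claimGoalWork(item.id))) continue; // skip work we could not claim
    const proof = await execute(item);
    await client.closeGoalWork(item.id, proof);
    closed.push(item.id);
  }
  return closed;
}
```

Closing with a proof string keeps each unit of work auditable after the agent's session ends.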
## Rollout — provision a tenant (Sprinter, Marbella, OCI)
1. **Install the node types.** `POST /api/plugins` with
`{ action: "install", install: { source: "manifest", manifestName: "@amble/company-brain", workspaceId } }`
(admin / operator) — idempotent; seeds the 9 canonical types on that tenant.
2. **Provision a key.** An API key with scope `tools:execute` and a role granting
`entities.team.read` + `entities.team.update`, owned by a current member of the tenant.
3. **Point the agent at it.** `POST /api/mcp/t/{tenantSlug}/server` with the Bearer key — the
server instructions orient the agent automatically.
4. **Arm the loop.** `createGoalFromTemplate` leaves the goal `draft`; arming it (draft → running)
is a one-time human step in the Amble UI. Thereafter the loop is agent-driven at the
`controlled_execute` tier.
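
Steps 1 and 3 can be scripted. The sketch below only builds the requests, so it can be inspected before anything is sent. The `HttpRequest` shape and both helper names are hypothetical, and sending the API key as a Bearer token on `/api/plugins` is an assumption (the text above only states Bearer auth for the MCP endpoint).

```typescript
interface HttpRequest { url: string; method: "POST"; headers: Record<string, string>; body: string }

// Strip trailing slashes so paths can be appended safely.
const trimBase = (baseUrl: string): string => baseUrl.replace(/\/+$/, "");

// Step 1: install the @amble/company-brain manifest on a workspace.
// The Authorization header here is an assumption.
function buildInstallRequest(baseUrl: string, apiKey: string, workspaceId: string): HttpRequest {
  return {
    url: `${trimBase(baseUrl)}/api/plugins`,
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({
      action: "install",
      install: { source: "manifest", manifestName: "@amble/company-brain", workspaceId },
    }),
  };
}

// Step 3: the tenant-scoped MCP endpoint the agent connects to.
function mcpServerUrl(baseUrl: string, tenantSlug: string): string {
  return `${trimBase(baseUrl)}/api/mcp/t/${encodeURIComponent(tenantSlug)}/server`;
}
```

Since the install is described as idempotent, re-running the first request against an already-seeded tenant should be safe.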
## Design decisions
- **Extend, don't add.** The brain is entity-graph + documents + retrieval composition — not a new
table or memory store (`no-parallel-systems`). `getContextPack` is the only new code; it wraps
existing RPCs.
- **Provenance lives in the schema and on edges**, reusing the value-level `field_sources` /
`lineage` model and `sourced_from` relations — not a bespoke provenance type.
- **External reads, gated.** `getGoalBrief`/`getGoalLog`/`getContextPack` require `entities.team.read`
and are tenant-scoped; goal-WRITE tools stay internal.
- **Ambient by server instructions, not a new tool.** The "use Amble as your brain" contract rides
the MCP spec's `instructions` field — no dedicated tool, no per-agent config (`thin-harness`,
`feedback_fewer_declarative_tools`).
- **Method and goal are data, pulled over MCP.** The ingestion playbook is a platform *skill* and
the loop is a goal-loop *template* — both version-controlled, both retrieved live, so no agent
carries a stale local copy.
## Related
- [Knowledge Loops](/docs/features/knowledge-loops) — the feed → synthesize → review → eval cycle
- [Shared Knowledge](/docs/features/shared-knowledge) — team memory injected into agent prompts
- [Tool System](/docs/features/tool-system) — the MCP tool surface
- [Document Processing](/docs/features/document-processing) — chunking + hybrid retrieval
- [Bundles](/docs/features/bundles) — how `@amble/company-brain` installs