# Shared Knowledge Graph
Cross-tenant curated knowledge library — AI trends, use cases, and other published entities flow into opted-in tenants via a copy-then-read sync pipeline. Zero RLS changes. Reads stay within-tenant.
## Overview
The Shared Knowledge Graph lets Amble maintain a curated library of authoritative
entities — AI trends, use cases, research signals — that any opted-in tenant can
access as a read-only "Shared library" layer inside their own graph. Content
curated for one customer becomes available to every consuming tenant on the next
sync, so the library grows more useful as the commons grows.
The implementation follows the **copy-then-read** model (ADR-0059): a service-role
Inngest job periodically copies published entities from the `amble-commons` tenant
into each opted-in tenant's own `entities` table. Reads stay 100% within-tenant.
Zero RLS changes. Zero revision to ADR-0003 (URL as sole tenant truth).
## Core Concepts
### The Commons Tenant
`amble-commons` (`AMBLE_COMMONS_TENANT_ID` in `features/tenant/constants.ts`) is
a dedicated system tenant that holds the curated library. It is **not** the default
signup tenant (`DEFAULT_TENANT_ID`) — the two are kept separate to avoid entangling
curated content with guest RLS and public signup flows.
All reads from and writes to the commons tenant use a service-role admin client
(`createAdminClient()`) with an explicit `tenant_id`. No human membership in the
commons tenant is required or seeded — curator authorization uses the global
`system_admin` role instead (see [Curating content](#curating-content)).
### Row Classes in the Commons
Two distinct row types live in the commons tenant. **Do not conflate them.**
| Class | `visibility` | `external_source` | Purpose |
|---|---|---|---|
| Published library row | `'shared'` | `'commons-curated'` | Synced out to consumers |
| Pending proposal row | `'private'` | `'tenant-proposal'` | Curator inbox; never synced |
A published row and the proposal that spawned it are **different rows**. Re-proposing
an already-published item creates a new pending proposal (revision cycle) and never
flips the published row back to private. This prevents a de-publishing loop where
any tenant could silently remove content from all consumers.
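The two row classes can be told apart with a small classifier. This is a minimal sketch, not the project's code: `CommonsRow`, `classifyCommonsRow`, and `isSyncable` are hypothetical names, and only the `visibility` / `external_source` values are taken from the table above.

```typescript
// Hypothetical row shape: only the two columns from the table are modelled.
interface CommonsRow {
  visibility: "shared" | "private" | "tenant";
  external_source: string | null;
}

type CommonsRowClass = "published" | "proposal" | "unknown";

// Classify a commons-tenant row. Anything matching neither class is reported
// as "unknown" instead of being guessed at.
function classifyCommonsRow(row: CommonsRow): CommonsRowClass {
  if (row.visibility === "shared" && row.external_source === "commons-curated") {
    return "published";
  }
  if (row.visibility === "private" && row.external_source === "tenant-proposal") {
    return "proposal";
  }
  return "unknown";
}

// Only published library rows are eligible to be synced out to consumers.
function isSyncable(row: CommonsRow): boolean {
  return classifyCommonsRow(row) === "published";
}
```

Keeping the check on both columns, rather than on `visibility` alone, mirrors the rule that a proposal never becomes syncable just by changing one field.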
### Synced Mirror Rows (Consumer Tenant)
When a tenant opts in to consume, synced copies carry provenance:
```
external_source = 'amble-commons'
external_id = <commons published entity id>
metadata.shared_origin = { commons_entity_id, version, synced_at }
visibility = 'tenant'
owner_id = NULL
```
The partial unique index `(tenant_id, external_source, external_id) WHERE external_id IS NOT NULL`
makes re-sync idempotent. Writes go through fetch→create/update (not raw upsert — the
partial index cannot be targeted safely by PostgREST's ON CONFLICT clause).
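The fetch→create/update pattern can be illustrated against an in-memory stand-in for the `entities` table. This is a sketch under stated assumptions: `InMemoryEntities`, `MirrorRow`, and `syncMirrorRow` are hypothetical, and the real path goes through `EntityWriter` and the database rather than a `Map`.

```typescript
// Hypothetical, simplified mirror row. Only the identity columns and one
// content field are modelled.
interface MirrorRow {
  tenant_id: string;
  external_source: string;
  external_id: string;
  title: string;
}

type MirrorIdentity = Pick<MirrorRow, "tenant_id" | "external_source" | "external_id">;

// In-memory stand-in for the table plus its unique identity index.
class InMemoryEntities {
  private rows = new Map<string, MirrorRow>();

  private static identity(r: MirrorIdentity): string {
    return [r.tenant_id, r.external_source, r.external_id].join("|");
  }

  find(r: MirrorIdentity): MirrorRow | undefined {
    return this.rows.get(InMemoryEntities.identity(r));
  }

  create(row: MirrorRow): void {
    const k = InMemoryEntities.identity(row);
    // Mimics the unique index: a second insert with the same identity fails.
    if (this.rows.has(k)) throw new Error(`duplicate identity: ${k}`);
    this.rows.set(k, { ...row });
  }

  update(row: MirrorRow): void {
    const k = InMemoryEntities.identity(row);
    if (!this.rows.has(k)) throw new Error(`missing row: ${k}`);
    this.rows.set(k, { ...row });
  }

  count(): number {
    return this.rows.size;
  }
}

// Fetch first, then branch: no ON CONFLICT clause is involved.
function syncMirrorRow(store: InMemoryEntities, row: MirrorRow): "created" | "updated" {
  if (store.find(row)) {
    store.update(row);
    return "updated";
  }
  store.create(row);
  return "created";
}
```

Running `syncMirrorRow` twice with the same identity leaves exactly one row, which is the idempotency property the index is there to protect.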
### Read-Only Guard
Synced mirror rows (`external_source = 'amble-commons'`) are **read-only to
everyone except the sync reconcile**. The guard lives in `EntityWriter` — NOT in
server actions — so it covers every write path: chat agents, API keys, keyed
integration upserts, bulk operations, and direct server actions. Create, update,
delete, and reparent all enforce it.
The capability is an **explicit `allowSharedLibraryWrite` flag**, not an actor
type. `actor.type === "system"` is too broad a key: the keyed integration paths
(`upsertEntityKeyed`, and `deleteEntityKeyed`/`reparentEntityKeyed` with no
attribution) also run as a system actor, so an actor-only gate would let a
connector configured with `externalSource: "amble-commons"` mutate, delete, or
spoof a mirror row. Only `syncSharedKnowledgeForTenant` sets the flag (mirrors the
`allowSystemAgent` no-implicit-pass-through pattern, ADR-0024). Every other
caller receives: _"This record is managed by the shared library and can't be
edited here."_ The opt-out purge bypasses `EntityWriter` entirely (a single
tenant- and source-scoped admin `DELETE`), so it needs no capability.
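The difference between gating on an explicit capability and gating on actor type can be sketched as follows. This is an illustration only: `SketchWriteContext`, `GuardedRow`, and `assertSharedLibraryWritable` are hypothetical names, not the actual `EntityWriter` API. The flag name and the error text are taken from the description above.

```typescript
const SHARED_ORIGIN_SOURCE = "amble-commons";

const READ_ONLY_MESSAGE =
  "This record is managed by the shared library and can't be edited here.";

// Hypothetical write context: the actor type is carried along, but the guard
// deliberately does not consult it.
interface SketchWriteContext {
  actorType: "user" | "agent" | "system";
  allowSharedLibraryWrite?: boolean;
}

interface GuardedRow {
  external_source: string | null;
}

// Throws unless the row is tenant-owned or the caller holds the explicit
// capability. A system actor without the flag is rejected like anyone else.
function assertSharedLibraryWritable(row: GuardedRow, ctx: SketchWriteContext): void {
  if (row.external_source !== SHARED_ORIGIN_SOURCE) return;
  if (ctx.allowSharedLibraryWrite === true) return;
  throw new Error(READ_ONLY_MESSAGE);
}
```

Because the flag defaults to absent, a new caller has to opt in explicitly, which is the no-implicit-pass-through property described above.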
## Opt-In Settings
The `shared_knowledge_graph` setting key (added to `features/settings/types.ts`)
controls participation per tenant:
```ts
interface SharedKnowledgeGraphSettings {
  consume?: boolean;    // pull the curated commons library into this tenant
  contribute?: boolean; // allow proposing entities back to the commons
}
```
With no settings row, both flags default to `false`. Toggle them via
Admin > Knowledge or the `updateTenantSettings` MCP tool. The value is persisted
through the existing `saveTenantSetting()` server action, so there is no new write path.
### Enabling consume
When `consume` flips `false → true` the toggle action immediately emits
`"shared-knowledge/sync.requested"` (Inngest) so the tenant's graph is populated
within seconds. A daily cron reconciles freshness for all opted-in tenants.
### Disabling consume
When `consume` flips `true → false` the toggle action emits
`"shared-knowledge/purge.requested"`. The purge job deletes every
`external_source = 'amble-commons'` row from the tenant's graph — opt-out is
clean, not archival.
The immediate purge enqueue is best-effort (time-bounded so a slow broker can't
stall the optimistic admin save). The daily cron reconciles **both** directions:
it re-syncs opted-in tenants **and** re-purges orphaned `amble-commons` mirrors
for any tenant whose setting is no longer `consume: true` — so an opt-out whose
immediate purge enqueue failed is still cleaned up within a day. The purge is
idempotent (deletes only mirror rows), so fanning out to a never-consumed tenant
is a harmless no-op.
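The toggle transitions and the two-directional daily reconcile can be summarised in a pair of pure functions. This is a hedged sketch: `eventForConsumeToggle` and `planDailyReconcile` are hypothetical helpers, and only the two event names come from the text above.

```typescript
type ConsumeEvent =
  | "shared-knowledge/sync.requested"
  | "shared-knowledge/purge.requested";

// Maps a change of the `consume` flag to the event the toggle action emits.
// A missing value is treated as false, matching the documented default.
function eventForConsumeToggle(
  prev: boolean | undefined,
  next: boolean | undefined,
): ConsumeEvent | null {
  const was = prev === true;
  const now = next === true;
  if (!was && now) return "shared-knowledge/sync.requested";
  if (was && !now) return "shared-knowledge/purge.requested";
  return null; // no transition, nothing to emit
}

interface TenantConsumeState {
  tenantId: string;
  consume?: boolean;
}

interface ReconcileStep {
  tenantId: string;
  action: "sync" | "purge";
}

// Daily cron plan: re-sync opted-in tenants and re-purge everyone else.
// Purging a tenant that never consumed is a no-op, so no history is needed.
function planDailyReconcile(tenants: TenantConsumeState[]): ReconcileStep[] {
  return tenants.map((t): ReconcileStep => ({
    tenantId: t.tenantId,
    action: t.consume === true ? "sync" : "purge",
  }));
}
```

Since the plan depends only on the current setting, a lost purge event is repaired on the next run without any record of the failed enqueue.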
## Sync Pipeline
```
amble-commons (visibility='shared' rows)
    │
    │  Inngest: shared-knowledge/sync.requested
    │  (on opt-in flip) + daily cron (re-syncs opted-in tenants,
    │  re-purges orphaned mirrors for opted-out tenants)
    ▼
syncSharedKnowledgeForTenant(tenantId)   [features/shared-knowledge/server/sync.ts]
    │
    ├── Resolve entity_type_id for each commons row's entity_type_slug
    │   (global type tenant_id IS NULL; skip if type is disabled by consumer)
    │
    ├── Fetch existing mirror row by (tenant_id, 'amble-commons', external_id)
    │
    ├── Create (new) or update (existing) via EntityWriter background WriteContext
    │   (keeps activity log, cache revalidation, events)
    │
    └── Tombstone: delete consumer mirror rows whose external_id is no
        longer in the published commons set (handles curator delete/unpublish)
```
Scalar content fields are copied (`title`, `description`, `content`, `tags`,
`image_url`, `source_url`). Relation-typed fields are ignored — cross-tenant
relation references would be broken. Relation sync is a planned fast-follow.
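The reconcile step reduces to a set comparison plus a field whitelist. The sketch below assumes hypothetical helpers `planReconcile` and `pickScalarFields`; the field list is taken from the paragraph above, and everything else is illustrative rather than the contents of `sync.ts`.

```typescript
// Scalar fields that are copied to the mirror row, per the list above.
const SCALAR_FIELDS = [
  "title",
  "description",
  "content",
  "tags",
  "image_url",
  "source_url",
] as const;

type ScalarField = (typeof SCALAR_FIELDS)[number];

// Copies only whitelisted fields. Relation-typed or unknown keys are dropped.
function pickScalarFields(
  source: Record<string, unknown>,
): Partial<Record<ScalarField, unknown>> {
  const out: Partial<Record<ScalarField, unknown>> = {};
  for (const field of SCALAR_FIELDS) {
    if (field in source) out[field] = source[field];
  }
  return out;
}

interface ReconcilePlan {
  create: string[];    // published, not yet mirrored
  update: string[];    // published and already mirrored
  tombstone: string[]; // mirrored, no longer published
}

// Compares published commons ids against the external_ids already mirrored
// in the consumer tenant.
function planReconcile(publishedIds: string[], mirroredExternalIds: string[]): ReconcilePlan {
  const published = new Set(publishedIds);
  const mirrored = new Set(mirroredExternalIds);
  return {
    create: publishedIds.filter((id) => !mirrored.has(id)),
    update: publishedIds.filter((id) => mirrored.has(id)),
    tombstone: mirroredExternalIds.filter((id) => !published.has(id)),
  };
}
```

For example, with published ids `["a", "b"]` and mirrored ids `["b", "c"]`, the plan creates `a`, updates `b`, and tombstones `c`.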
### Type filtering
If the consumer tenant has disabled a global entity type via
`tenant_entity_type_settings`, rows of that type are skipped during sync rather
than reported as errors. This lets a tenant opt in to consume the library while
suppressing types irrelevant to its domain.
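A minimal sketch of the skip-not-error behaviour, assuming the set of disabled type slugs has already been loaded. `TypedRow` and `partitionByEnabledType` are hypothetical names used only for illustration.

```typescript
// Hypothetical commons row carrying its entity type slug.
interface TypedRow {
  id: string;
  entity_type_slug: string;
}

// Splits rows into those to sync and those skipped because the consumer
// tenant disabled their type. Skipped rows are returned, never thrown on.
function partitionByEnabledType(
  rows: TypedRow[],
  disabledSlugs: ReadonlySet<string>,
): { toSync: TypedRow[]; skipped: TypedRow[] } {
  const toSync: TypedRow[] = [];
  const skipped: TypedRow[] = [];
  for (const row of rows) {
    if (disabledSlugs.has(row.entity_type_slug)) {
      skipped.push(row);
    } else {
      toSync.push(row);
    }
  }
  return { toSync, skipped };
}
```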
## Contribute → Curate → Publish Loop
### Proposing an entity
Any tenant with `contribute = true` can propose an entity to the commons from its
detail page ("Propose to shared library" action). The server action at
`features/shared-knowledge/server/propose.ts` validates:
1. Active tenant from URL (`getTenantContext()`).
2. Caller has `entities.team.update` permission.
3. Tenant setting `contribute === true`.
4. Source row `tenant_id` matches the active tenant (ownership guard — blocks
cross-tenant param injection even under service role).
5. Source row is not itself a synced mirror (`external_source !== 'amble-commons'`).
If all checks pass, a **pending proposal row** is created in the commons tenant.
Re-proposing the same entity updates the existing proposal (idempotent on
`external_id = '<origin_tenant_id>:<origin_entity_id>'`).
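The validation order and the idempotency key can be sketched as a pure function. This is an assumption-laden illustration: `ProposeInput`, `ProposeResult`, and `validateProposal` are hypothetical, and the permission and tenant lookups that the real server action performs are represented here as already-resolved inputs.

```typescript
// Hypothetical, pre-resolved inputs for the checks listed above.
interface ProposeInput {
  activeTenantId: string;           // check 1: resolved from the URL
  hasTeamUpdatePermission: boolean; // check 2: entities.team.update
  contribute: boolean | undefined;  // check 3: tenant setting
  source: { id: string; tenant_id: string; external_source: string | null };
}

type ProposeResult =
  | { ok: true; externalId: string }
  | { ok: false; reason: string };

function validateProposal(input: ProposeInput): ProposeResult {
  const { activeTenantId, hasTeamUpdatePermission, contribute, source } = input;
  if (!hasTeamUpdatePermission) {
    return { ok: false, reason: "missing entities.team.update permission" };
  }
  if (contribute !== true) {
    return { ok: false, reason: "tenant has not enabled contribute" };
  }
  // Check 4: ownership guard against cross-tenant parameter injection.
  if (source.tenant_id !== activeTenantId) {
    return { ok: false, reason: "source row belongs to another tenant" };
  }
  // Check 5: a synced mirror cannot be proposed back to the commons.
  if (source.external_source === "amble-commons") {
    return { ok: false, reason: "source row is a synced mirror" };
  }
  // Idempotency key: '<origin_tenant_id>:<origin_entity_id>'.
  return { ok: true, externalId: `${source.tenant_id}:${source.id}` };
}
```

Deriving the key from the origin tenant and entity ids means the same entity always maps to the same proposal row, regardless of how many times it is proposed.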
### Curator queue
Global `system_admin` users see the curator queue at Admin > Knowledge > Proposals.
Actions available:
- **Approve** — creates or refreshes a published library row (`external_source = 'commons-curated'`,
`visibility = 'shared'`); emits `"shared-knowledge/published"` to fan out sync
to every consuming tenant; marks the proposal `status = 'approved'`.
- **Reject** — sets `status = 'rejected'` with optional notes; the proposal stays
private and is never synced.
Authorization for curator actions uses the global `system_admin` role — NOT
commons tenant membership. This means any future `system_admin` can curate without
needing to be manually added as a commons member.
## Knowledge-loop integration
The commons is not a knowledge system parallel to the
[knowledge-loop lens](/docs/features/knowledge-loops) (ADR-0057) — it plugs into
one. A tenant's `KnowledgeLoopDefinition` references the commons as a first-class
**source** (consume) and **publish surface** (contribute), using the loop's own
vocabulary. The adapter is pure and lives in
`features/shared-knowledge/lib/knowledge-loop-source.ts`:
```ts
// consume → loop sources (kind "integration", sourceRef "commons:<collection>")
commonsKnowledgeSources({ consume, collections }): KnowledgeSource[]
// contribute → loop publish surface (ref "commons:contribute")
commonsContributionSurface({ contribute }): KnowledgeSurface | null
```
`knowledgeLoopToGoalLoopTemplate` carries a commons source into
`spec.workSources` and the contribute surface into
`spec.outputs[].publishDestinations`, so the commons rides the existing goal-loop
runner with no new source/surface kind. The dependency points one way:
shared-knowledge depends on the loop _types_; the loop contract stays
commons-agnostic (thin-harness). See `knowledge-loop-source.test.ts` for an
end-to-end loop that both consumes and contributes.
## Seed Content (Release Step)
The commons tenant is bootstrapped with curated content by running:
```bash
pnpm seed:types # ensure global ai_trend + use_case types exist first
pnpm seed:shared-knowledge # idempotent; safe to re-run
```
`scripts/seed-shared-knowledge-content.ts` seeds at least 8 `ai_trend` rows and at
least 4 `use_case` rows with real 2026 AI content. Rows are keyed by slug, so
re-running the script updates them in place and never creates duplicates.
The seed is a **release step**, not a migration. It runs against the deployed
database after the Lane A migration has been applied and `seed:types` has synced
the global entity types.
## Provenance Constants
All provenance markers are in `features/shared-knowledge/lib/constants.ts` (no
product entity-type slugs — platform boundary rule):
```ts
export const SHARED_ORIGIN_SOURCE = 'amble-commons';     // consumer mirror rows
export const COMMONS_CURATED_SOURCE = 'commons-curated'; // published library rows in commons
export const PROPOSAL_SOURCE = 'tenant-proposal';        // pending proposal rows in commons
```
## Agent Instructions
Agents operating against a consuming tenant will see shared library rows as
ordinary entities with `external_source = 'amble-commons'`. The "Shared library"
badge in the UI signals read-only status.
**Agents should:**
- Read and reference shared library entities freely for context and analysis.
- Use `createEntity` or entity update tools on **tenant-owned** rows only.
- Surface the "Propose to shared library" action to users who want to contribute
a tenant entity back to the commons (requires `contribute = true` and
`entities.team.update` permission).
**Agents must not:**
- Attempt to update or delete a row with `external_source = 'amble-commons'` —
`EntityWriter` will reject it with a clear error message.
- Create new entities with `external_source = 'amble-commons'` — that value is
reserved for the sync pipeline.
- Make cross-tenant reads to fetch commons content directly — reads are always
within the active tenant; the sync pipeline handles cross-tenant movement.
## Key Files
| Path | Purpose |
|---|---|
| `features/shared-knowledge/lib/constants.ts` | Provenance string constants |
| `features/shared-knowledge/lib/knowledge-loop-source.ts` | Commons → knowledge-loop source/surface adapter |
| `features/shared-knowledge/server/settings.ts` | Opt-in setting readers (request + background) |
| `features/shared-knowledge/server/sync.ts` | Full-reconcile sync job |
| `features/shared-knowledge/server/cleanup.ts` | Opt-out purge |
| `features/shared-knowledge/server/propose.ts` | Tenant → commons proposal |
| `features/shared-knowledge/server/curate.ts` | Curator approve/reject |
| `features/shared-knowledge/components/shared-knowledge-toggle.tsx` | Admin toggle UI |
| `features/inngest/functions/sync-shared-knowledge.ts` | Inngest handlers + cron |
| `features/tenant/constants.ts` | `AMBLE_COMMONS_TENANT_ID`, `AMBLE_COMMONS_TENANT_SLUG` |
| `scripts/seed-shared-knowledge-content.ts` | Curated content seed (release step) |
| `features/shared-knowledge/lib/collections.ts` | Collection catalog + pure selector |
| `features/shared-knowledge/server/collections.ts` | `enableKnowledgeCollectionsForTenant` (install hook) |
| `features/shared-knowledge/components/knowledge-collections-picker.tsx` | Admin collections picker UI |
## Knowledge collections (selective install)
Consuming the commons is not all-or-nothing. A tenant can install **named
collections** — topical/modality slices of the curated library — instead of the
whole thing (e.g. clinical knowledge bundled by modality, so each location
installs only what its equipment needs).
- A collection is a **tag on the live commons row** (`metadata.collections: string[]`),
not a frozen copy. Collections stay curated in the commons and keep syncing —
they get smarter over time like any other commons content.
- The installable set is a small code catalog: `COMMONS_COLLECTIONS` in
`features/shared-knowledge/lib/collections.ts` (`{ slug, title, description }`).
- The opt-in setting gains `collections?: string[]`. Semantics: `consume: true`
with an **empty** `collections` list = **all collections** (legacy, preserved);
a **non-empty** list = mirror only rows whose `metadata.collections` intersects
the installed set. `selectConsumableRows()` is the pure selector; the sync uses
its output for both the mirror loop and the tombstone set, so **uninstalling a
collection tombstones its mirror rows on the next sync** — no new delete path.
- Admins manage collections at **Admin > Knowledge** (`/admin/knowledge`) via the
`KnowledgeCollectionsPicker`.
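The collection semantics can be expressed as a pure filter. The sketch below is not the real `selectConsumableRows()`; `selectRowsForInstalledCollections` and `CollectionTaggedRow` are hypothetical stand-ins that follow the rules stated in the list above.

```typescript
// Hypothetical commons row: only the collection tags are modelled.
interface CollectionTaggedRow {
  id: string;
  metadata: { collections?: string[] };
}

// Empty or missing `installed` means "all collections" (legacy behaviour).
// Otherwise keep rows whose tags intersect the installed set.
function selectRowsForInstalledCollections<T extends CollectionTaggedRow>(
  rows: T[],
  installed: string[] | undefined,
): T[] {
  if (!installed || installed.length === 0) return rows;
  const wanted = new Set(installed);
  return rows.filter((row) =>
    (row.metadata.collections ?? []).some((slug) => wanted.has(slug)),
  );
}
```

If the same output feeds both the mirror loop and the tombstone comparison, a row dropped by this filter is treated like an unpublished row, which is why uninstalling a collection needs no separate delete path.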
### Bundling collections with workspaces / templates
Collections are declared **by reference**, not copied as `seedData`. A
`BundleManifest` (and therefore a `WorkspaceTemplate.__manifest`) carries an
optional `knowledgeCollections: string[]`. On install, `installBundle()`'s
post-commit hook calls `enableKnowledgeCollectionsForTenant(tenantId, slugs)`,
which **additively** flips the tenant's live opt-in (it never narrows a tenant
already consuming everything) and enqueues a sync. The SKG sync engine then
mirrors the live rows. This reuses the install path as the **trigger** and the
sync engine as the **transport** — no second data-copy mechanism, and the data
is a live subscription rather than a static snapshot. (Use `seedData` for
tenant-private starter records; use `knowledgeCollections` for shared curated
knowledge.)
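The additive rule can be sketched as a pure settings merge. `mergeInstalledCollections` and `SketchSkgSettings` are hypothetical; the real `enableKnowledgeCollectionsForTenant` also persists the setting and enqueues a sync, which this sketch omits. Treating an empty slug list as a no-op is an assumption made here, so that a bundle without collections cannot widen a tenant to "all collections".

```typescript
// Hypothetical mirror of the opt-in setting, including `collections`.
interface SketchSkgSettings {
  consume?: boolean;
  contribute?: boolean;
  collections?: string[];
}

// Adds collection slugs to a tenant's opt-in without ever narrowing it.
function mergeInstalledCollections(
  current: SketchSkgSettings | undefined,
  slugs: string[],
): SketchSkgSettings {
  const base: SketchSkgSettings = { ...(current ?? {}) };
  // Assumption: nothing to install means nothing changes.
  if (slugs.length === 0) return base;

  const existing = base.collections ?? [];
  const consumingEverything = base.consume === true && existing.length === 0;
  // A tenant already consuming all collections must stay that way.
  if (consumingEverything) return base;

  const merged = Array.from(new Set(existing.concat(slugs)));
  return { ...base, consume: true, collections: merged };
}
```

The union keeps previously installed collections, so installing a second bundle extends the subscription instead of replacing it.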
A bundle whose knowledge subscription fails to enable still installs
successfully — `installBundle()` surfaces a soft `knowledgeCollectionsWarning`
(repairable from `/admin/knowledge`) rather than failing the install.
## References
- [ADR-0059: Copy-then-read for the shared knowledge commons](/docs/adr/0059) —
why not federated cross-tenant reads; commons tenant design; provenance; idempotency;
read-only capability guard; knowledge-loop relationship.
- [ADR-0057: Knowledge-loop definitions](/docs/adr/0057) — the lens the commons
plugs into as a source / publish surface.
- [ADR-0056: Open Knowledge Format](/docs/adr/0056) — copy-then-read as the
synchronization primitive for cross-context knowledge sharing.
- [ADR-0003: URL as sole tenant truth](/docs/adr/0003) — the constraint that ruled
out federated reads.
- [Settings](/docs/features/settings) — how `shared_knowledge_graph` is stored
and read.
- [Entity System](/docs/features/entities) — `EntityWriter`, `WriteContext`, the
partial unique identity index.