Documentation source

Context Compaction for Long Conversations

Summarize conversation history when approaching context limits to preserve key facts, avoid duplicate work, and maintain task continuity.

## Problem

When conversations get long (many tool calls, entity operations), the current context management strategy (`clear_tool_uses_20250919`) simply drops older tool inputs/outputs. This causes:

1. **Agent re-does work** — creates duplicate entities because it forgot it already created them
2. **Loses task list** — the `reportProgress` tool call that tracks multi-step work gets dropped
3. **Loses decisions** — earlier conclusions, user corrections, and key facts vanish
4. **No summary** — the model has no "condensed memory" of what happened before the window

## Solution

Implement a **compaction step** that runs before context management triggers. Instead of silently dropping tool uses, generate a structured summary that preserves:

- **Created/modified entities** (IDs, titles, types) — prevents re-creation
- **Current task list** (from latest `reportProgress`) — maintains continuity
- **Key decisions and corrections** — user feedback, pivots, constraints
- **Conversation summary** — what was discussed, what was concluded

### Architecture

```
User sends message
  → Load full history
  → Count tokens (estimate)
  → If tokens > COMPACTION_THRESHOLD (e.g., 60k):
      → Extract latest reportProgress tasks
      → Call a fast model (Haiku) to summarize older messages
      → Build compacted history:
          [system prompt]
          [compaction summary message with entity IDs, tasks, key facts]
          [last N messages verbatim]
      → Send compacted history to streamText
  → Else: send full history as normal
```
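The flow above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `ChatMessage` shape, the `compactIfNeeded` and `estimateTokens` names, the 4-characters-per-token heuristic, and the choice to carry the summary as a `user` message are all assumptions. The fast-model call is passed in as a function so the sketch does not depend on any particular SDK.

```typescript
// Hypothetical message shape; the real code likely uses its SDK's message types.
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Assumed defaults, taken from the examples in this document.
const COMPACTION_THRESHOLD = 60_000;
const KEEP_LAST_N = 10;

// Rough heuristic (about 4 characters per token); not a real tokenizer.
function estimateTokens(messages: ChatMessage[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Injected so the fast-model call (e.g. Haiku) stays outside this sketch.
type Summarizer = (older: ChatMessage[]) => Promise<string>;

async function compactIfNeeded(
  history: ChatMessage[],
  summarize: Summarizer,
  threshold: number = COMPACTION_THRESHOLD,
  keepLastN: number = KEEP_LAST_N,
): Promise<{ messages: ChatMessage[]; compacted: boolean }> {
  // Short conversations pass through untouched.
  if (estimateTokens(history) <= threshold) {
    return { messages: history, compacted: false };
  }

  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");

  // Nothing older than the verbatim tail, so there is nothing to summarize.
  if (rest.length <= keepLastN) {
    return { messages: history, compacted: false };
  }

  const older = rest.slice(0, rest.length - keepLastN);
  const recent = rest.slice(rest.length - keepLastN);

  // Assumption: the summary is carried as a single user-role message.
  const summary: ChatMessage = { role: "user", content: await summarize(older) };

  return { messages: [...system, summary, ...recent], compacted: true };
}
```

The returned `compacted` flag is one way the admin indicator could be driven. A real implementation would probably also need to avoid cutting between a tool call and its result when choosing the verbatim tail, which this sketch does not handle.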

### Compaction Summary Format

```markdown
## Conversation Summary (auto-compacted)

### Entities Created/Modified
- [Entity Type] "Title" (id: uuid) — status/outcome

### Current Task List
- [x] Task 1 (completed)
- [ ] Task 2 (in progress)
- [ ] Task 3 (pending)

### Key Decisions
- User confirmed X approach over Y
- Field Z should use format A

### Context
Brief narrative of what was discussed and concluded.
```
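To keep entity IDs and task state out of the summarizer's hands, the summary could be assembled from structured data and only the narrative left to the model. The sketch below renders the format shown above; the `CompactionSummary` type, its field names, the task status values, and the `- (none)` placeholder for empty sections are assumptions made for illustration.

```typescript
// Hypothetical structured form of the summary; field names are illustrative.
interface EntityRecord {
  type: string;
  title: string;
  id: string;
  status: string;
}

type TaskStatus = "completed" | "in_progress" | "pending";

interface TaskItem {
  title: string;
  status: TaskStatus;
}

interface CompactionSummary {
  entities: EntityRecord[];
  tasks: TaskItem[];
  decisions: string[];
  context: string;
}

const STATUS_LABEL: Record<TaskStatus, string> = {
  completed: "completed",
  in_progress: "in progress",
  pending: "pending",
};

// Renders one "### Title" section; empty sections get an explicit placeholder.
function section(title: string, lines: string[]): string {
  const body = lines.length > 0 ? lines.join("\n") : "- (none)";
  return `### ${title}\n${body}`;
}

function renderCompactionSummary(s: CompactionSummary): string {
  // \u2014 is the dash separator used in the documented entity line format.
  const entities = s.entities.map(
    (e) => `- [${e.type}] "${e.title}" (id: ${e.id}) \u2014 ${e.status}`,
  );
  const tasks = s.tasks.map((t) => {
    const box = t.status === "completed" ? "[x]" : "[ ]";
    return `- ${box} ${t.title} (${STATUS_LABEL[t.status]})`;
  });
  const decisions = s.decisions.map((d) => `- ${d}`);

  return [
    "## Conversation Summary (auto-compacted)",
    section("Entities Created/Modified", entities),
    section("Current Task List", tasks),
    section("Key Decisions", decisions),
    `### Context\n${s.context}`,
  ].join("\n\n");
}
```

Rendering IDs from data rather than from model output means a paraphrasing or truncating summarizer cannot corrupt them.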

### Design Considerations

- **Compaction is lossy**: always keep the last N messages verbatim (N of at least 10) so recent turns are never summarized
- **Entity IDs are critical**: the summary MUST include entity IDs to prevent re-creation
- **Task list is critical**: extract it from the latest `reportProgress` call rather than from the generated summary
- **Use a cheaper/faster model** for summarization (Haiku) to minimize latency and cost
- **Compaction is invisible to users**: the user doesn't see it, but admins can see a "compacted" indicator via the config info icon
- **Threshold tuning**: compaction should trigger well before context management does (e.g., at 60k tokens vs. 80k), so the model gets the summary instead of raw truncation
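Taking the task list from the latest `reportProgress` call could look like the sketch below. Only the tool name comes from this document; the `HistoryMessage` and `ToolCall` shapes and the `{ tasks: [{ title, done }] }` input format are hypothetical, and the real tool schema may differ.

```typescript
// Hypothetical shapes; the actual message and tool-call types will differ.
interface ToolCall {
  toolName: string;
  input: unknown;
}

interface HistoryMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  toolCalls?: ToolCall[];
}

// Assumed task shape for the reportProgress input.
interface Task {
  title: string;
  done: boolean;
}

// Runtime guard, since tool inputs arrive as untyped JSON.
function isTaskArray(value: unknown): value is Task[] {
  return (
    Array.isArray(value) &&
    value.every(
      (t) =>
        typeof t === "object" &&
        t !== null &&
        typeof (t as Task).title === "string" &&
        typeof (t as Task).done === "boolean",
    )
  );
}

// Walks the history newest-first and returns the tasks from the most recent
// well-formed reportProgress call, or null if there is none.
function extractLatestTasks(history: HistoryMessage[]): Task[] | null {
  for (const message of [...history].reverse()) {
    for (const call of [...(message.toolCalls ?? [])].reverse()) {
      if (call.toolName !== "reportProgress") continue;
      const input = call.input as { tasks?: unknown } | null | undefined;
      if (input && isTaskArray(input.tasks)) return input.tasks;
    }
  }
  return null;
}
```

This has to run against the full history before older messages are summarized or dropped, so the task list never depends on what the summarizer chose to keep. Skipping malformed calls and falling back to an earlier valid one is a choice made in this sketch, not a stated requirement.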

### Alternatives Considered

1. **Sliding window only**: loses context and causes duplicate work (current behavior)
2. **Vector retrieval (RAG)**: over-engineered for conversation continuity; better suited to knowledge-base lookup
3. **Client-side summarization**: adds latency on the client and can't access full tool outputs

## Acceptance Criteria

- [ ] Long conversations (20+ exchanges with tool calls) don't lose key facts
- [ ] Agent doesn't re-create entities it already created earlier in the conversation
- [ ] Task list from `reportProgress` survives compaction
- [ ] Compaction adds less than 2s to response time
- [ ] Compaction only triggers when needed (short conversations unaffected)
- [ ] Admin config info icon shows when a message used compacted context

## Files

- `features/agents/runtime.ts` — add compaction step before streamText
- `features/agents/provider-options.ts` — adjust thresholds
- `features/chat/message-utils.ts` — extract task list and entity IDs from message history
- `app/api/chat/route.ts` — wire up compaction in chat flow
- `features/agents/lib/compact-context.ts` — new: compaction logic
