Context Compaction for Long Conversations
Summarize conversation history when approaching context limits to preserve key facts, avoid duplicate work, and maintain task continuity
## Problem
When conversations get long (many tool calls, entity operations), the current context management strategy (`clear_tool_uses_20250919`) simply drops older tool inputs/outputs. This causes:
1. **Agent re-does work** — creates duplicate entities because it forgot it already created them
2. **Loses task list** — the `reportProgress` tool call that tracks multi-step work gets dropped
3. **Loses decisions** — earlier conclusions, user corrections, and key facts vanish
4. **No summary** — the model has no "condensed memory" of what happened before the window
## Solution
Implement a **compaction step** that runs before context management triggers. Instead of silently dropping tool uses, generate a structured summary that preserves:
- **Created/modified entities** (IDs, titles, types) — prevents re-creation
- **Current task list** (from latest `reportProgress`) — maintains continuity
- **Key decisions and corrections** — user feedback, pivots, constraints
- **Conversation summary** — what was discussed, what was concluded
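
As a rough illustration of the first item, the sketch below collects entity references from tool results so they can be listed in the summary. The `ToolResult` and `EntityRef` shapes and the tool names in the usage note are assumptions made for this example; the real message format in the codebase may differ.

```typescript
// Hypothetical shapes: field names are assumptions, not taken from the codebase.
interface ToolResult {
  toolName: string;
  output: unknown;
}

interface EntityRef {
  id: string;
  type: string;
  title: string;
}

// Narrow an unknown tool output to an entity-like object.
function isEntityRef(value: unknown): value is EntityRef {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.type === "string" &&
    typeof v.title === "string"
  );
}

// Collect entities in order of first appearance. A later result for the
// same ID overwrites the earlier one, so the latest title is kept.
function collectEntities(results: ToolResult[]): EntityRef[] {
  const byId = new Map<string, EntityRef>();
  for (const result of results) {
    if (isEntityRef(result.output)) {
      const { id, type, title } = result.output;
      byId.set(id, { id, type, title });
    }
  }
  return Array.from(byId.values());
}
```

For example, a create result followed by an update result for the same ID yields a single entry carrying the updated title, which is the record the model needs in order to avoid creating the entity again.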
### Architecture
```
User sends message
  → Load full history
  → Count tokens (estimate)
  → If tokens > COMPACTION_THRESHOLD (e.g., 60k):
      → Extract latest reportProgress tasks
      → Call a fast model (Haiku) to summarize older messages
      → Build compacted history:
          [system prompt]
          [compaction summary message with entity IDs, tasks, key facts]
          [last N messages verbatim]
      → Send compacted history to streamText
  → Else: send full history as normal
```
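
The flow above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual implementation: the `ChatMessage` shape, the 4-characters-per-token estimate, the `Summarizer` callback, and the choice to send the summary as a `user` message are all hypothetical. The model call is injected so the sketch does not depend on any particular SDK.

```typescript
// Hypothetical message shape; real chat messages carry more fields.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

const COMPACTION_THRESHOLD = 60_000; // tokens; the example value from the flow above
const KEEP_LAST_N = 10; // recent messages kept verbatim

// Rough estimate using ~4 characters per token (a heuristic, not a tokenizer).
function estimateTokens(messages: ChatMessage[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Split history into system prompt, older messages to summarize, and the recent tail.
function splitHistory(history: ChatMessage[], keepLastN: number) {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  const cut = Math.max(0, rest.length - keepLastN);
  return { system, older: rest.slice(0, cut), recent: rest.slice(cut) };
}

// The summarizer is injected so the fast-model call stays outside this sketch.
type Summarizer = (older: ChatMessage[]) => Promise<string>;

async function compactIfNeeded(
  history: ChatMessage[],
  summarize: Summarizer,
  threshold: number = COMPACTION_THRESHOLD,
  keepLastN: number = KEEP_LAST_N,
): Promise<{ messages: ChatMessage[]; compacted: boolean }> {
  const { system, older, recent } = splitHistory(history, keepLastN);
  // Short conversations, or ones with nothing older than the tail, pass through untouched.
  if (estimateTokens(history) <= threshold || older.length === 0) {
    return { messages: history, compacted: false };
  }
  // Assumption: the summary is sent as a user-role message after the system prompt.
  const summary: ChatMessage = { role: "user", content: await summarize(older) };
  return { messages: [...system, summary, ...recent], compacted: true };
}
```

The `compacted` flag is one possible way to feed the admin indicator. Note that a plain cut at `keepLastN` can separate a tool call from its result; a real implementation would likely need to move the boundary so such pairs stay together.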
### Compaction Summary Format
```markdown
## Conversation Summary (auto-compacted)
### Entities Created/Modified
- [Entity Type] "Title" (id: uuid) — status/outcome
### Current Task List
- [x] Task 1 (completed)
- [ ] Task 2 (in progress)
- [ ] Task 3 (pending)
### Key Decisions
- User confirmed X approach over Y
- Field Z should use format A
### Context
Brief narrative of what was discussed and concluded.
```
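
One way to produce this format is to keep the summary as structured data and render it to markdown at the last step. The sketch below assumes hypothetical `EntityEntry`, `TaskEntry`, and `CompactionSummary` shapes; only the output layout comes from the format above.

```typescript
// Hypothetical shapes: field names are assumptions for illustration.
interface EntityEntry {
  id: string;
  type: string;
  title: string;
  status: string;
}

interface TaskEntry {
  title: string;
  status: "completed" | "in_progress" | "pending";
}

interface CompactionSummary {
  entities: EntityEntry[];
  tasks: TaskEntry[];
  decisions: string[];
  context: string;
}

const STATUS_LABEL: Record<TaskEntry["status"], string> = {
  completed: "completed",
  in_progress: "in progress",
  pending: "pending",
};

// Render the structured summary into the markdown layout shown above.
function renderSummary(summary: CompactionSummary): string {
  const lines: string[] = ["## Conversation Summary (auto-compacted)"];
  lines.push("### Entities Created/Modified");
  for (const e of summary.entities) {
    // \u2014 is the dash separator used in the format above.
    lines.push(`- [${e.type}] "${e.title}" (id: ${e.id}) \u2014 ${e.status}`);
  }
  lines.push("### Current Task List");
  for (const t of summary.tasks) {
    const box = t.status === "completed" ? "[x]" : "[ ]";
    lines.push(`- ${box} ${t.title} (${STATUS_LABEL[t.status]})`);
  }
  lines.push("### Key Decisions");
  for (const d of summary.decisions) lines.push(`- ${d}`);
  lines.push("### Context", summary.context);
  return lines.join("\n");
}
```

Keeping entities and tasks as data until rendering means those sections can be filled from tool calls directly, leaving only the decisions and narrative context to the summarizing model.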
### Design Considerations
- **Compaction is lossy**: always keep the last N messages verbatim (N of at least 10)
- **Entity IDs are critical**: the summary MUST include entity IDs to prevent re-creation
- **Task list is critical**: extract it from the latest `reportProgress` call rather than from the model-generated summary
- **Use a cheaper/faster model** for summarization (Haiku) to minimize latency and cost
- **Compaction is invisible to users**: the user doesn't see it, but admins can see a "compacted" indicator via the config info icon
- **Threshold tuning**: compaction should trigger well before context management does (e.g., 60k vs 80k tokens), so the model gets the summary instead of raw truncation
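
Pulling the task list straight from the latest `reportProgress` call might look like the sketch below. The `ToolCall` and `HistoryMessage` shapes and the `{ tasks: [...] }` payload are assumptions; the actual `reportProgress` input schema is not specified here and may differ.

```typescript
// Hypothetical shapes: the real tool-call and reportProgress payloads may differ.
interface ToolCall {
  toolName: string;
  input: unknown;
}

interface HistoryMessage {
  role: "user" | "assistant" | "tool";
  toolCalls?: ToolCall[];
}

interface Task {
  title: string;
  status: string;
}

// Check that an unknown input looks like { tasks: Task[] }.
function isTaskList(value: unknown): value is { tasks: Task[] } {
  if (typeof value !== "object" || value === null) return false;
  const tasks = (value as { tasks?: unknown }).tasks;
  return (
    Array.isArray(tasks) &&
    tasks.every(
      (t) =>
        typeof t === "object" &&
        t !== null &&
        typeof (t as Task).title === "string" &&
        typeof (t as Task).status === "string",
    )
  );
}

// Walk the history backwards so the most recent well-formed call wins.
// Malformed reportProgress inputs are skipped rather than returned.
function extractLatestTasks(history: HistoryMessage[]): Task[] | undefined {
  for (const message of history.slice().reverse()) {
    for (const call of (message.toolCalls ?? []).slice().reverse()) {
      if (call.toolName === "reportProgress" && isTaskList(call.input)) {
        return call.input.tasks;
      }
    }
  }
  return undefined;
}
```

Because the tasks are copied from the tool call itself, the list that reaches the model is exact even if the summarizing model paraphrases or drops items elsewhere in the summary.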
### Alternatives Considered
1. **Sliding window only**: loses context and causes duplicate work (current behavior)
2. **Vector retrieval (RAG)**: over-engineered for conversation continuity; better suited to knowledge-base lookup
3. **Client-side summarization**: adds latency on the client and can't access full tool outputs
## Acceptance Criteria
- [ ] Long conversations (20+ exchanges with tool calls) don't lose key facts
- [ ] Agent doesn't re-create entities it already created earlier in the conversation
- [ ] Task list from `reportProgress` survives compaction
- [ ] Compaction adds less than 2s to response time
- [ ] Compaction only triggers when needed (short conversations unaffected)
- [ ] Admin config info icon shows when a message used compacted context
## Files
- `features/agents/runtime.ts` — add compaction step before streamText
- `features/agents/provider-options.ts` — adjust thresholds
- `features/chat/message-utils.ts` — extract task list and entity IDs from message history
- `app/api/chat/route.ts` — wire up compaction in chat flow
- `features/agents/lib/compact-context.ts` — new: compaction logic