Repo: https://github.com/anomalyco/opencode (commit 22a4c5a)
opencode compaction has two prompt components that combine at runtime.
packages/opencode/src/agent/prompt/compaction.txt
You are a helpful AI assistant tasked with summarizing conversations.
When asked to summarize, provide a detailed but concise summary of the conversation.
Focus on information that would be helpful for continuing the conversation, including:
- What was done
- What is currently being worked on
- Which files are being modified
- What needs to be done next
- Key user requests, constraints, or preferences that should persist
- Important technical decisions and why they were made
Your summary should be comprehensive enough to provide context but concise enough to be quickly understood.
Do not respond to any questions in the conversation, only output the summary.
packages/opencode/src/session/compaction.ts:173
Appended after the full conversation history at compaction time:
Provide a detailed prompt for continuing our conversation above.
Focus on information that would be helpful for continuing the conversation, including what we did,
what we're doing, which files we're working on, and what we're going to do next.
The summary that you construct will be used so that another agent can read it and continue the work.
When constructing the summary, try to stick to this template:
---
## Goal
[What goal(s) is the user trying to accomplish?]
## Instructions
- [What important instructions did the user give you that are relevant]
- [If there is a plan or spec, include information about it so next agent can continue using it]
## Discoveries
[What notable things were learned during this conversation that would be useful for the next agent
to know when continuing the work]
## Accomplished
[What work has been completed, what work is still in progress, and what work is left?]
## Relevant files / directories
[Construct a structured list of relevant files that have been read, edited, or created that pertain
to the task at hand. If all the files in a directory are relevant, include the path to the directory.]
---
Plugins can replace this prompt entirely or append context via experimental.session.compacting
(compaction.ts:168).
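A plugin hook of this kind might look like the sketch below. The `CompactingInput`, `CompactingOutput`, and `Hooks` types are assumptions written for illustration, not opencode's published plugin types; only the hook name `experimental.session.compacting` comes from the text above.

```typescript
// Assumed hook shapes, for illustration only. Check the opencode plugin types before relying on them.
interface CompactingInput {
  sessionID: string
}

interface CompactingOutput {
  context: string[] // extra context appended to the default compaction prompt
  prompt?: string // when set, replaces the default compaction prompt entirely
}

type Hooks = {
  "experimental.session.compacting"?: (input: CompactingInput, output: CompactingOutput) => Promise<void>
}

// Variant 1: append context and keep the default template.
const appendHooks: Hooks = {
  "experimental.session.compacting": async (_input, output) => {
    output.context.push("Always keep the names of failing tests in the summary.")
  },
}

// Variant 2: replace the prompt entirely.
const replaceHooks: Hooks = {
  "experimental.session.compacting": async (_input, output) => {
    output.prompt = "Summarize the session as a numbered list of open tasks."
  },
}

const demoOutput: CompactingOutput = { context: [] }
appendHooks["experimental.session.compacting"]?.({ sessionID: "example" }, demoOutput)?.then(() => {
  console.log(demoOutput.context)
})
```

Appending is the lighter-touch option: the structured template stays in place and the next agent still receives the Goal/Instructions/Discoveries sections.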
opencode doesn't compact inline. The flow is:
turn completes
  ↓ token count checked via isOverflow()
  ↓ if overflow → SessionCompaction.create() writes CompactionPart marker to DB
next loop iteration
  ↓ filterCompacted() loads history (stops at last compaction boundary)
  ↓ sees pending CompactionPart
  ↓ SessionCompaction.process(): full history + summary prompt → LLM
  ↓ response stored as assistant message, summary: true
  ↓ synthetic "continue or ask" user message appended
next loop iteration
  ↓ filterCompacted() now breaks at the summary message
  ↓ conversation continues with compacted history only
export async function filterCompacted(stream: AsyncIterable<MessageV2.WithParts>) {
  const result = [] as MessageV2.WithParts[]
  // IDs of user messages whose compaction produced a successful summary
  const completed = new Set<string>()
  for await (const msg of stream) {
    result.push(msg)
    // Stop at the user message that carries the compaction marker,
    // but only once its summary is known to have succeeded
    if (
      msg.info.role === "user" &&
      completed.has(msg.info.id) &&
      msg.parts.some((part) => part.type === "compaction")
    )
      break
    if (msg.info.role === "assistant" && msg.info.summary && msg.info.finish && !msg.info.error)
      completed.add(msg.info.parentID)
  }
  // Messages were collected newest first; restore chronological order
  result.reverse()
  return result
}
Scans backward (newest message first). It stops at the first user message that contains a compaction part and has a successful summary: true assistant response. Everything before that point is dropped from active context.
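To see the boundary behaviour on concrete data, here is a minimal synchronous sketch of the same filter. The `Msg` shape and `filterCompactedSync` are hypothetical simplifications of `MessageV2.WithParts` and `filterCompacted()`; the real function consumes an async stream.

```typescript
// Simplified, hypothetical message shape; opencode's MessageV2.WithParts carries more fields.
type Msg = {
  id: string
  role: "user" | "assistant"
  parentID?: string // for assistant messages: the user message being answered
  summary?: boolean
  finished?: boolean
  error?: boolean
  partTypes: string[]
}

// Synchronous variant of the filter, for illustration only.
// `newestFirst` must be ordered from most recent to oldest.
function filterCompactedSync(newestFirst: Msg[]): Msg[] {
  const result: Msg[] = []
  const completed = new Set<string>()
  for (const msg of newestFirst) {
    result.push(msg)
    if (msg.role === "user" && completed.has(msg.id) && msg.partTypes.includes("compaction")) break
    if (msg.role === "assistant" && msg.summary && msg.finished && !msg.error && msg.parentID)
      completed.add(msg.parentID)
  }
  return result.reverse()
}

// Chronological history: old work, compaction marker, summary, then new work.
const history: Msg[] = [
  { id: "u1", role: "user", partTypes: ["text"] },
  { id: "a1", role: "assistant", parentID: "u1", finished: true, partTypes: ["text"] },
  { id: "u2", role: "user", partTypes: ["compaction"] },
  { id: "a2", role: "assistant", parentID: "u2", summary: true, finished: true, partTypes: ["text"] },
  { id: "u3", role: "user", partTypes: ["text"] },
]

const active = filterCompactedSync([...history].reverse())
console.log(active.map((m) => m.id).join(", ")) // u2, a2, u3
```

If the summary message `a2` were missing or had errored, `completed` would stay empty and the whole history would be returned, which is how a pending compaction still sees everything it needs to summarize.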
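const COMPACTION_BUFFER = 20_000

export async function isOverflow(input: { tokens; model }) {
  // Auto compaction disabled in config: never report overflow
  if (config.compaction?.auto === false) return false
  const context = input.model.limit.context
  // A context limit of 0 skips the check
  if (context === 0) return false
  // Prefer the reported total; otherwise sum the individual counters
  const count =
    input.tokens.total ||
    input.tokens.input + input.tokens.output + input.tokens.cache.read + input.tokens.cache.write
  // Headroom: config override, else the smaller of the buffer and the capped output limit
  const reserved =
    config.compaction?.reserved ?? Math.min(COMPACTION_BUFFER, ProviderTransform.maxOutputTokens(input.model))
  // With a separate input limit, subtract the reserve from it;
  // otherwise subtract the capped output limit from the context window
  const usable = input.model.limit.input
    ? input.model.limit.input - reserved
    : context - ProviderTransform.maxOutputTokens(input.model)
  return count >= usable
}
maxOutputTokens(model) (transform.ts:875):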
export const OUTPUT_TOKEN_MAX = Flag.OPENCODE_EXPERIMENTAL_OUTPUT_TOKEN_MAX || 32_000

export function maxOutputTokens(model): number {
  // Cap the model's output limit; fall back to the cap when the limit is 0 or missing
  return Math.min(model.limit.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
}
Effective formula:
reserved = config.compaction.reserved
           ?? min(20000, min(model.output_limit, 32000))
usable   = model.limit.input
           ? model.limit.input - reserved
           : context - min(model.output_limit, 32000)
trigger  = actual_tokens >= usable
Uses actual token count from API response — no estimates. For most models this fires at ~96–99% of context.
Example (Claude Sonnet 200k context, 8k output):
- reserved = min(20000, 8000) = 8000
- usable = 200000 − 8000 = 192000 (96%)
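The formula and the worked example can be checked with a small standalone function. This is a sketch rather than opencode's code: `ModelLimits`, `usableTokens`, and `shouldCompact` are hypothetical names, and the two constants are copied from the excerpts above.

```typescript
// Hypothetical, simplified model limits; opencode's real model type has more fields.
interface ModelLimits {
  context: number // total context window in tokens
  output: number // provider's output token limit
  input?: number // optional separate input limit
}

const COMPACTION_BUFFER = 20_000 // from the isOverflow() excerpt
const OUTPUT_TOKEN_MAX = 32_000 // default cap from the maxOutputTokens() excerpt

// Mirrors maxOutputTokens(): cap the output limit, falling back to the cap when it is 0.
function maxOutputTokens(limits: ModelLimits): number {
  return Math.min(limits.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
}

// Number of tokens that can be used before compaction triggers.
function usableTokens(limits: ModelLimits, reservedOverride?: number): number {
  const reserved = reservedOverride ?? Math.min(COMPACTION_BUFFER, maxOutputTokens(limits))
  return limits.input ? limits.input - reserved : limits.context - maxOutputTokens(limits)
}

function shouldCompact(actualTokens: number, limits: ModelLimits, reservedOverride?: number): boolean {
  if (limits.context === 0) return false // same early return as isOverflow()
  return actualTokens >= usableTokens(limits, reservedOverride)
}

const sonnet: ModelLimits = { context: 200_000, output: 8_000 }
console.log(usableTokens(sonnet)) // 192000
console.log(shouldCompact(191_999, sonnet)) // false
console.log(shouldCompact(192_000, sonnet)) // true
```

Note that `reserved` only takes effect when the model declares a separate input limit; without one, the threshold is the context window minus the capped output limit, which happens to give the same 192000 in this example.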
After each step-finish event in the inner stream loop, the loop checks isOverflow(). On overflow it sets needsCompaction = true, breaks the inner loop, and returns "compact" to the outer loop.
Outer loop (prompt.ts:705):
if (result === "compact") {
  await SessionCompaction.create({
    sessionID,
    agent: lastUser.agent,
    model: lastUser.model,
    auto: true,
    overflow: !processor.message.finish, // true if turn didn't complete
  })
}

Separate check on the last finished message:
if (lastFinished && lastFinished.summary !== true &&
    (await SessionCompaction.isOverflow({ tokens: lastFinished.tokens, model }))) {
  await SessionCompaction.create({ sessionID, agent, model, auto: true })
  continue
}
Catches overflow that was detected but not yet actioned from a previous session.
When the provider returns a ContextOverflowError (HTTP 400 / provider overflow signal), needsCompaction is set to true and the flow is the same as for the step-finish check above.
After a successful compaction with auto: true:
- Overflow + replay message: Re-creates the triggering user message as a new message. Media files (images, PDFs) are replaced by text: [Attached image/jpeg: filename.jpg].
- Normal overflow: Prepends note about media being stripped: "The previous request exceeded the provider's size limit due to large media attachments..."
- No replay: Synthetic user message: "Continue if you have next steps, or stop and ask for clarification if you are unsure how to proceed."
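The media replacement in the replay case can be sketched as a simple map over message parts. The `Part` union and `stripMedia` are hypothetical names for illustration; only the placeholder format is taken from the list above.

```typescript
// Hypothetical part shapes, for illustration; opencode's real part types differ.
type Part =
  | { type: "text"; text: string }
  | { type: "file"; mime: string; filename: string }

// Replace media attachments with a text placeholder so the replayed message stays small.
function stripMedia(parts: Part[]): Part[] {
  return parts.map((part): Part =>
    part.type === "file"
      ? { type: "text", text: `[Attached ${part.mime}: ${part.filename}]` }
      : part,
  )
}

const replayed = stripMedia([
  { type: "text", text: "What is in this screenshot?" },
  { type: "file", mime: "image/jpeg", filename: "filename.jpg" },
])
const texts = replayed.map((part) => (part.type === "text" ? part.text : "<file>"))
console.log(texts[1]) // [Attached image/jpeg: filename.jpg]
```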
Pruning runs after every session (prompt.ts:716):
export const PRUNE_MINIMUM = 20_000
export const PRUNE_PROTECT = 40_000
const PRUNE_PROTECTED_TOOLS = ["skill"]
Algorithm:
- Walk backwards through tool results
- Keep most recent PRUNE_PROTECT (40k) tokens of tool outputs intact
- If tool outputs beyond that zone total > PRUNE_MINIMUM (20k): stamp time.compacted on each
- When serialized → "[Old tool result content cleared]"
Never crosses a compaction boundary. No LLM involved. Protected tools (skill) are never pruned.
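The pruning pass can be sketched as follows. This is a simplified illustration, not opencode's implementation: the `ToolResult` shape, the `prune` and `serialize` helpers, and the precomputed `tokens` field are assumptions, and the compaction-boundary check is left out.

```typescript
const PRUNE_MINIMUM = 20_000
const PRUNE_PROTECT = 40_000
const PRUNE_PROTECTED_TOOLS = ["skill"]
const CLEARED = "[Old tool result content cleared]"

// Hypothetical, simplified tool result. Token counts are assumed to be known already.
interface ToolResult {
  tool: string
  output: string
  tokens: number
  compactedAt?: number // stands in for the time.compacted stamp
}

// Marks old tool results as compacted. `results` is in chronological order.
// Returns the number of results that were marked.
function prune(results: ToolResult[], now: number = Date.now()): number {
  let seen = 0 // tool output tokens walked so far, newest first
  let prunable = 0 // tokens that fall outside the protected zone
  const candidates: ToolResult[] = []
  for (const result of [...results].reverse()) {
    // Assumption: protected tools are skipped entirely and do not count toward the zone
    if (PRUNE_PROTECTED_TOOLS.includes(result.tool)) continue
    seen += result.tokens
    if (seen > PRUNE_PROTECT) {
      prunable += result.tokens
      candidates.push(result)
    }
  }
  if (prunable <= PRUNE_MINIMUM) return 0 // too little to gain, leave everything intact
  for (const result of candidates) result.compactedAt = now
  return candidates.length
}

// What the model sees for a given tool result.
function serialize(result: ToolResult): string {
  return result.compactedAt === undefined ? result.output : CLEARED
}

const results: ToolResult[] = [
  { tool: "read", output: "old file contents", tokens: 30_000 },
  { tool: "skill", output: "skill instructions", tokens: 50_000 },
  { tool: "bash", output: "recent test output", tokens: 35_000 },
]
console.log(prune(results)) // 1
console.log(results.map(serialize).join(" | "))
```

Only the oldest `read` result is cleared here: the `bash` output sits inside the 40k protected zone and the `skill` output is exempt regardless of age.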
opencode.json:
{
  "compaction": {
    "auto": false,     // disable automatic compaction (default: true)
    "prune": false,    // disable tool output pruning (default: true)
    "reserved": 30000  // token buffer (default: min(20000, model output limit))
  },
  "agents": {
    "compaction": {
      "model": {
        "providerID": "anthropic",
        "modelID": "claude-3-5-haiku-20241022"
      }
    }
  }
}
Environment variables (flag.ts:19):
- OPENCODE_DISABLE_AUTOCOMPACT: disables auto compaction
- OPENCODE_DISABLE_PRUNE: disables pruning
- OPENCODE_EXPERIMENTAL_OUTPUT_TOKEN_MAX: overrides output token cap (default: 32,000)
compaction: {
  name: "compaction",
  mode: "primary",
  native: true,
  hidden: true, // not shown in UI
  prompt: PROMPT_COMPACTION,
  permission: PermissionNext.merge(
    defaults,
    PermissionNext.fromConfig({ "*": "deny" }), // ALL tools denied
    user,
  ),
  options: {},
}
- Hidden from user-facing agent list
- No tools available (tools: {} at call site)
- All permissions denied
- Uses same model as the conversation by default
- Override via agents.compaction.model in config
| Property | Value |
|---|---|
| Mechanism | Marker → filterCompacted() truncation loop |
| Trigger | actual_tokens >= context - reserved_output_buffer |
| Default threshold | ~96–99% of context |
| Token source | Actual API response token count |
| Configurable | Yes — reserved, auto, model override, env vars |
| Prompt | System prompt + structured template (Goal/Instructions/Discoveries/Accomplished/Files) |
| History to LLM | Full history up to compaction point, media stripped |
| Post-compaction | Synthetic "continue or ask" user message |
| Overflow replay | Triggering message re-created, media → text placeholder |
| Prune | Separate, no LLM — marks old tool outputs cleared |
| Plugin hook | experimental.session.compacting |