@sam-saffron-jarvis
Created March 4, 2026 22:04
opencode compaction deep dive — prompts, threshold, mechanism, edge cases

opencode — Compaction Deep Dive

Repo: https://github.com/anomalyco/opencode (commit 22a4c5a)


Prompts

opencode compaction has two prompt components that combine at runtime.

1. Compaction Agent System Prompt

packages/opencode/src/agent/prompt/compaction.txt

You are a helpful AI assistant tasked with summarizing conversations.

When asked to summarize, provide a detailed but concise summary of the conversation.
Focus on information that would be helpful for continuing the conversation, including:
- What was done
- What is currently being worked on
- Which files are being modified
- What needs to be done next
- Key user requests, constraints, or preferences that should persist
- Important technical decisions and why they were made

Your summary should be comprehensive enough to provide context but concise enough to be quickly understood.

Do not respond to any questions in the conversation, only output the summary.

2. User Prompt (appended as final turn)

packages/opencode/src/session/compaction.ts:173

Appended after the full conversation history at compaction time:

Provide a detailed prompt for continuing our conversation above.
Focus on information that would be helpful for continuing the conversation, including what we did,
what we're doing, which files we're working on, and what we're going to do next.
The summary that you construct will be used so that another agent can read it and continue the work.

When constructing the summary, try to stick to this template:
---
## Goal

[What goal(s) is the user trying to accomplish?]

## Instructions

- [What important instructions did the user give you that are relevant]
- [If there is a plan or spec, include information about it so next agent can continue using it]

## Discoveries

[What notable things were learned during this conversation that would be useful for the next agent
to know when continuing the work]

## Accomplished

[What work has been completed, what work is still in progress, and what work is left?]

## Relevant files / directories

[Construct a structured list of relevant files that have been read, edited, or created that pertain
to the task at hand. If all the files in a directory are relevant, include the path to the directory.]
---

Plugins can replace this prompt entirely or append context via experimental.session.compacting (compaction.ts:168).
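
As a rough illustration of the hook described above, the sketch below shows a plugin-style handler that appends context rather than replacing the prompt. The types CompactingInput, CompactingOutput and Hooks, the field names context and prompt, and the resolvePrompt helper are all assumptions made for this example; the real signature lives in the opencode plugin package and should be checked there.

```typescript
// Hypothetical local types. The real hook signature is defined by opencode
// and may differ from this sketch.
interface CompactingInput {
  sessionID: string
}
interface CompactingOutput {
  context: string[] // assumed: extra context appended to the default prompt
  prompt?: string // assumed: if set, replaces the default prompt entirely
}
type Hooks = {
  "experimental.session.compacting"?: (
    input: CompactingInput,
    output: CompactingOutput,
  ) => Promise<void>
}

// A handler that keeps the default template and adds one extra instruction.
const hooks: Hooks = {
  "experimental.session.compacting": async (_input, output) => {
    output.context.push("Preserve the names of failing tests verbatim.")
  },
}

// Hypothetical helper showing how a host could combine the hook output with
// the default prompt: a full replacement wins, otherwise context is appended.
function resolvePrompt(defaultPrompt: string, output: CompactingOutput): string {
  return output.prompt ?? [defaultPrompt, ...output.context].join("\n\n")
}
```

Appending context is the less invasive choice: the structured template (Goal, Instructions, Discoveries, Accomplished, Relevant files) stays intact and the plugin only adds to it.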


Mechanism: Marker → filter loop

opencode doesn't compact inline. An overflow check writes a marker, and later loop iterations act on it:

turn completes
    ↓ token count checked via isOverflow()
    ↓ if overflow → SessionCompaction.create() writes CompactionPart marker to DB
next loop iteration
    ↓ filterCompacted() loads history (stops at last compaction boundary)
    ↓ sees pending CompactionPart
    ↓ SessionCompaction.process(): full history + summary prompt → LLM
    ↓ response stored as assistant message, summary: true
    ↓ synthetic "continue or ask" user message appended
next loop iteration
    ↓ filterCompacted() now breaks at the summary message
    ↓ conversation continues with compacted history only

filterCompacted() — history truncation

message-v2.ts:809

export async function filterCompacted(stream: AsyncIterable<MessageV2.WithParts>) {
  const result = [] as MessageV2.WithParts[]
  const completed = new Set<string>()
  for await (const msg of stream) {
    result.push(msg)
    if (
      msg.info.role === "user" &&
      completed.has(msg.info.id) &&
      msg.parts.some((part) => part.type === "compaction")
    )
      break
    if (msg.info.role === "assistant" && msg.info.summary && msg.info.finish && !msg.info.error)
      completed.add(msg.info.parentID)
  }
  result.reverse()
  return result
}

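The stream is consumed newest-first, so the loop scans backward through history. It stops at the first user message that both contains a compaction part and has a successfully finished summary: true assistant response (tracked through parentID in the completed set). Everything older than that message is dropped from active context, and the result is reversed back into chronological order.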
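
To make the boundary rule concrete, here is a small worked example. It uses a synchronous copy of the loop and a simplified Msg type; both are stand-ins written for this illustration, not the real MessageV2 types, and the message ids are made up.

```typescript
// Simplified message shape for illustration only.
interface Msg {
  info: {
    id: string
    role: "user" | "assistant"
    parentID?: string
    summary?: boolean
    finish?: string
    error?: unknown
  }
  parts: { type: string }[]
}

// Synchronous variant of the loop above. Input must be ordered newest first.
function filterCompactedSync(newestFirst: Iterable<Msg>): Msg[] {
  const result: Msg[] = []
  const completed = new Set<string>()
  for (const msg of newestFirst) {
    result.push(msg)
    // Stop at a compaction marker whose summary has already been produced.
    if (
      msg.info.role === "user" &&
      completed.has(msg.info.id) &&
      msg.parts.some((part) => part.type === "compaction")
    )
      break
    // Remember which user message a successful summary answers.
    if (
      msg.info.role === "assistant" &&
      msg.info.summary &&
      msg.info.finish &&
      !msg.info.error &&
      msg.info.parentID
    )
      completed.add(msg.info.parentID)
  }
  return result.reverse()
}

// Newest first: m1/m2 are old turns, m3 is the marker, m4 its summary.
const history: Msg[] = [
  { info: { id: "m6", role: "assistant", parentID: "m5", finish: "stop" }, parts: [{ type: "text" }] },
  { info: { id: "m5", role: "user" }, parts: [{ type: "text" }] },
  { info: { id: "m4", role: "assistant", parentID: "m3", summary: true, finish: "stop" }, parts: [{ type: "text" }] },
  { info: { id: "m3", role: "user" }, parts: [{ type: "compaction" }] },
  { info: { id: "m2", role: "assistant", parentID: "m1", finish: "stop" }, parts: [{ type: "text" }] },
  { info: { id: "m1", role: "user" }, parts: [{ type: "text" }] },
]

const kept = filterCompactedSync(history).map((m) => m.info.id)
console.log(kept.join(",")) // m3,m4,m5,m6
```

If the marker m3 has no finished summary yet, nothing matches the break condition and the whole history is returned, which is why the summarizing call still sees everything back to the previous boundary.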


Threshold

compaction.ts:32

const COMPACTION_BUFFER = 20_000

export async function isOverflow(input: { tokens; model }) {
  if (config.compaction?.auto === false) return false
  const context = input.model.limit.context
  if (context === 0) return false

  const count =
    input.tokens.total ||
    input.tokens.input + input.tokens.output + input.tokens.cache.read + input.tokens.cache.write

  const reserved =
    config.compaction?.reserved ?? Math.min(COMPACTION_BUFFER, ProviderTransform.maxOutputTokens(input.model))
  const usable = input.model.limit.input
    ? input.model.limit.input - reserved
    : context - ProviderTransform.maxOutputTokens(input.model)
  return count >= usable
}

maxOutputTokens(model) (transform.ts:875):

export const OUTPUT_TOKEN_MAX = Flag.OPENCODE_EXPERIMENTAL_OUTPUT_TOKEN_MAX || 32_000
export function maxOutputTokens(model): number {
  return Math.min(model.limit.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
}

Effective formula:

reserved  = config.compaction.reserved
          ?? min(20000, min(model.output_limit, 32000))

usable    = model.limit.input
          ? model.limit.input - reserved
          : context - min(model.output_limit, 32000)

trigger   = actual_tokens >= usable

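The check uses the actual token count from the API response, not an estimate. For models with a small output limit this fires at roughly 96–99% of context. When no input limit is defined and the output cap reaches 32,000, the margin is larger: 200000 − 32000 = 168000, or 84% of a 200k context.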

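Example (Claude Sonnet, 200k context, 8k output, assuming model.limit.input is also 200000):

  • reserved = min(20000, min(8000, 32000)) = 8000
  • usable = 200000 − 8000 = 192000 (96%)

Without model.limit.input the fallback branch gives the same figure: 200000 − min(8000, 32000) = 192000.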
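
The formula can be checked with a few lines of arithmetic. The sketch below restates it as a pure function; ModelLimit and usableTokens are names chosen for this example, and the constants are copied from the excerpts above with the experimental flag override left out.

```typescript
// Illustrative restatement of the threshold formula. Names are hypothetical.
interface ModelLimit {
  context: number
  output: number
  input?: number
}

const COMPACTION_BUFFER = 20_000
const OUTPUT_TOKEN_MAX = 32_000 // default cap, ignoring the experimental flag

function maxOutputTokens(limit: ModelLimit): number {
  return Math.min(limit.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
}

function usableTokens(limit: ModelLimit, reservedOverride?: number): number {
  const reserved = reservedOverride ?? Math.min(COMPACTION_BUFFER, maxOutputTokens(limit))
  // Note: reserved only applies when an explicit input limit exists.
  return limit.input ? limit.input - reserved : limit.context - maxOutputTokens(limit)
}

console.log(usableTokens({ context: 200_000, output: 8_000 })) // 192000
console.log(usableTokens({ context: 200_000, output: 64_000 })) // 168000
console.log(usableTokens({ context: 200_000, output: 64_000, input: 200_000 })) // 180000
```

One consequence visible in the sketch: in the fallback branch the reserved value is never read, so a configured compaction.reserved would only matter for models that define limit.input, at least as far as the excerpt above shows.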

Three Trigger Points

1. Post-turn (processor.ts)

processor.ts:282

After each step-finish event in the inner stream loop, the processor checks isOverflow(). On overflow it sets needsCompaction = true, breaks out of the inner loop, and returns "compact" to the outer loop.

Outer loop (prompt.ts:705):

if (result === "compact") {
  await SessionCompaction.create({
    sessionID,
    agent: lastUser.agent,
    model: lastUser.model,
    auto: true,
    overflow: !processor.message.finish, // true if turn didn't complete
  })
}

2. Pre-turn (prompt.ts)

prompt.ts:542

if (lastFinished && lastFinished.summary !== true &&
    (await SessionCompaction.isOverflow({ tokens: lastFinished.tokens, model }))) {
  await SessionCompaction.create({ sessionID, agent, model, auto: true })
  continue
}

Catches an overflow that was detected in a previous session but not yet acted on. The summary !== true guard skips messages that are themselves compaction summaries.

3. API-level context overflow (processor.ts)

processor.ts:358

When the provider returns a ContextOverflowError (HTTP 400 or a provider-specific overflow signal), the processor sets needsCompaction = true and follows the same flow as trigger 1.


Post-Compaction Injection

compaction.ts:235

After a successful compaction with auto: true:

  • Overflow + replay message: The triggering user message is re-created as a new message. Media files (images, PDFs) are replaced by text placeholders such as [Attached image/jpeg: filename.jpg].
  • Normal overflow: A note about stripped media is prepended: "The previous request exceeded the provider's size limit due to large media attachments..."
  • No replay: A synthetic user message is appended: "Continue if you have next steps, or stop and ask for clarification if you are unsure how to proceed."
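
The media replacement in the replay case can be pictured as a simple map over message parts. This is a minimal sketch, not the code at compaction.ts:235: the Part type, the replaceMedia name and the "file" fallback for a missing filename are assumptions; only the placeholder format is taken from the text above.

```typescript
// Hypothetical part shapes for illustration only.
type Part =
  | { type: "text"; text: string }
  | { type: "file"; mime: string; filename?: string }

// Swap every file part for a text placeholder so the replayed message
// no longer carries the large attachment.
function replaceMedia(parts: Part[]): Part[] {
  return parts.map((part): Part =>
    part.type === "file"
      ? { type: "text", text: `[Attached ${part.mime}: ${part.filename ?? "file"}]` }
      : part,
  )
}

const replayed = replaceMedia([
  { type: "text", text: "What is wrong in this screenshot?" },
  { type: "file", mime: "image/jpeg", filename: "filename.jpg" },
])
console.log(replayed)
```

The second part comes back as a text part reading [Attached image/jpeg: filename.jpg], while the original text part passes through unchanged.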

Prune (separate, no LLM)

compaction.ts:58

Runs after every session (prompt.ts:716):

export const PRUNE_MINIMUM = 20_000
export const PRUNE_PROTECT = 40_000
const PRUNE_PROTECTED_TOOLS = ["skill"]

Algorithm:

  1. Walk backwards through tool results
  2. Keep most recent PRUNE_PROTECT (40k) tokens of tool outputs intact
  3. If tool outputs beyond that zone total > PRUNE_MINIMUM (20k): stamp time.compacted on each
  4. When serialized, a stamped output is replaced by "[Old tool result content cleared]"

Never crosses a compaction boundary. No LLM involved. Protected tools (skill) are never pruned.
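
The four steps can be sketched over a flat list of tool outputs. This is a simplified model under stated assumptions: ToolOutput, pruneOutputs and serialize are hypothetical names, token counts are given rather than estimated, compaction boundaries are not modelled, and an output that pushes the running total past PRUNE_PROTECT is treated as outside the protected zone.

```typescript
const PRUNE_MINIMUM = 20_000
const PRUNE_PROTECT = 40_000
const PRUNE_PROTECTED_TOOLS = ["skill"]

// Hypothetical flattened view of a completed tool call.
interface ToolOutput {
  tool: string
  tokens: number
  content: string
  compacted?: number // timestamp stamped when the output is pruned
}

// outputs are in chronological order (oldest first). Returns how many were stamped.
function pruneOutputs(outputs: ToolOutput[], now: number = Date.now()): number {
  let total = 0
  let prunable = 0
  const candidates: ToolOutput[] = []
  // Step 1: walk backwards, newest output first.
  for (const out of [...outputs].reverse()) {
    if (PRUNE_PROTECTED_TOOLS.includes(out.tool)) continue
    total += out.tokens
    // Step 2: outputs inside the most recent 40k tokens stay intact.
    if (total > PRUNE_PROTECT) {
      prunable += out.tokens
      candidates.push(out)
    }
  }
  // Step 3: only prune when enough tokens would be freed.
  if (prunable <= PRUNE_MINIMUM) return 0
  for (const out of candidates) out.compacted = now
  return candidates.length
}

// Step 4: stamped outputs are replaced at serialization time.
function serialize(out: ToolOutput): string {
  return out.compacted ? "[Old tool result content cleared]" : out.content
}
```

The minimum threshold avoids rewriting stored parts for a negligible saving; pruning only happens when more than 20k tokens of old tool output can be cleared in one pass.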


Configuration

config.ts:1138

opencode.json:

{
  "compaction": {
    "auto": false,        // disable automatic compaction (default: true)
    "prune": false,       // disable tool output pruning (default: true)
    "reserved": 30000     // token buffer (default: min(20000, model output limit))
  },
  "agents": {
    "compaction": {
      "model": {
        "providerID": "anthropic",
        "modelID": "claude-3-5-haiku-20241022"
      }
    }
  }
}

Environment variables (flag.ts:19):

  • OPENCODE_DISABLE_AUTOCOMPACT — disables auto compaction
  • OPENCODE_DISABLE_PRUNE — disables pruning
  • OPENCODE_EXPERIMENTAL_OUTPUT_TOKEN_MAX — overrides output token cap (default: 32,000)

Compaction Agent Properties

agent.ts:157

compaction: {
  name: "compaction",
  mode: "primary",
  native: true,
  hidden: true,             // not shown in UI
  prompt: PROMPT_COMPACTION,
  permission: PermissionNext.merge(
    defaults,
    PermissionNext.fromConfig({ "*": "deny" }), // ALL tools denied
    user,
  ),
  options: {},
}

  • Hidden from user-facing agent list
  • No tools available (tools: {} at call site)
  • All permissions denied
  • Uses same model as the conversation by default
  • Override via agents.compaction.model in config

Summary Table

Property        | Value
----------------|------
Mechanism       | Marker → filterCompacted() truncation loop
Trigger         | actual_tokens >= context - reserved_output_buffer
Default threshold | ~96–99% of context
Token source    | Actual API response token count
Configurable    | Yes: reserved, auto, model override, env vars
Prompt          | System prompt + structured template (Goal/Instructions/Discoveries/Accomplished/Files)
History to LLM  | Full history up to compaction point, media stripped
Post-compaction | Synthetic "continue or ask" user message
Overflow replay | Triggering message re-created, media → text placeholder
Prune           | Separate, no LLM; marks old tool outputs cleared
Plugin hook     | experimental.session.compacting