@sam-saffron-jarvis
Created March 4, 2026 22:26
Roo Code — Condensation/Compression Deep Dive

Roo Code — Compression/Condensation Analysis

Repo: https://github.com/RooVetGit/Roo-Code
Commit: 3e237e6 (v0.1.17)
Language: TypeScript (VS Code extension)


Summary

Roo Code calls its compression system "condensation". It's a full extract mechanism using a "fresh start model": all prior messages are non-destructively tagged and hidden; the new effective history is a single summary user message. A separate LLM API call performs the summarisation. Sliding window truncation exists as a fallback when condensation fails or is disabled.


Mechanism: Non-Destructive Extract ("Fresh Start Model")

src/core/condense/index.ts: summarizeConversation(), getEffectiveApiHistory()

Not a piggyback. This is a full history replacement, but non-destructive:

  1. A separate API call is made with the conversation history + condensation prompt
  2. The model responds with structured text of the form <analysis>...<summary>...
  3. All existing messages get tagged with condenseParent: <uuid> (stored but hidden)
  4. A new user-role summary message is appended with isSummary: true and condenseId: <uuid>
  5. getEffectiveApiHistory() slices history from the summary forward, so older messages are never sent to the API again
// tag all prior messages
const newMessages = messages.map(msg => {
  if (!msg.condenseParent) return { ...msg, condenseParent: condenseId }
  return msg
})
// append summary
newMessages.push(summaryMessage) // role: "user", isSummary: true

Why user-role? So the next assistant turn starts cleanly — the model "reads" its own briefing as a user instruction.

Non-destructive: Messages are never deleted from storage. If the user rewinds past the condensation point, cleanupAfterTruncation() clears orphaned condenseParent references and the full history is restored.
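The tag, append, slice, and rewind steps described above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the repo's implementation: the StoredMessage shape and the functions applyCondensation, effectiveHistory, and rewindTo are hypothetical names, and the rewind logic is only one plausible reading of what cleanupAfterTruncation() does.

```typescript
// Hypothetical message shape; only the fields named in the text are modelled.
interface StoredMessage {
  role: "user" | "assistant"
  text: string
  isSummary?: boolean
  condenseId?: string
  condenseParent?: string
}

// Steps 3 and 4: tag every not-yet-tagged message, then append the summary.
function applyCondensation(messages: StoredMessage[], summaryText: string, condenseId: string): StoredMessage[] {
  const tagged = messages.map((msg) => (msg.condenseParent ? msg : { ...msg, condenseParent: condenseId }))
  const summary: StoredMessage = { role: "user", text: summaryText, isSummary: true, condenseId }
  return [...tagged, summary]
}

// Step 5: everything before the most recent summary stays stored but is not sent.
function effectiveHistory(messages: StoredMessage[]): StoredMessage[] {
  let lastSummary = -1
  messages.forEach((msg, i) => {
    if (msg.isSummary) lastSummary = i
  })
  return lastSummary === -1 ? messages : messages.slice(lastSummary)
}

// Rewind (assumed behaviour): drop messages from `index` on, then clear any
// condenseParent whose summary message no longer exists.
function rewindTo(messages: StoredMessage[], index: number): StoredMessage[] {
  const kept = messages.slice(0, index)
  const liveIds = new Set(kept.filter((m) => m.isSummary).map((m) => m.condenseId))
  return kept.map((msg) =>
    msg.condenseParent && !liveIds.has(msg.condenseParent) ? { ...msg, condenseParent: undefined } : msg,
  )
}
```

Because nothing is deleted, rewinding to a point before the summary leaves the original messages untagged again, and the effective history is once more the full history.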


Threshold

src/core/context-management/index.ts: manageContext(), willManageContext()

Condensation fires when either condition is true:

contextPercent >= effectiveThreshold
OR
prevContextTokens > allowedTokens

Where:

  • contextPercent = (100 * prevContextTokens) / contextWindow
  • allowedTokens = contextWindow * (1 - TOKEN_BUFFER_PERCENTAGE) - reservedTokens
    • TOKEN_BUFFER_PERCENTAGE = 0.1 (10% buffer)
    • reservedTokens = maxTokens || ANTHROPIC_DEFAULT_MAX_TOKENS

Default settings

  • autoCondenseContext = true (enabled by default)
  • autoCondenseContextPercent = 100 (fires at 100% fill; in practice, the allowedTokens check fires first, at roughly 85-90%)

The allowedTokens formula means condensation fires at approximately 90% of the context window minus the max output tokens. With Claude 3.5 Sonnet (200k context, 8k max output), that is 200k * 0.9 - 8k = 172k tokens, or about 86% of the window.
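The two trigger conditions and the worked example can be checked with a short sketch. The function names and the ContextCheck shape are hypothetical, and the value 8192 for ANTHROPIC_DEFAULT_MAX_TOKENS is an assumption made for illustration, since the text names the constant but not its value.

```typescript
const TOKEN_BUFFER_PERCENTAGE = 0.1
// Assumed value for illustration only.
const ANTHROPIC_DEFAULT_MAX_TOKENS = 8192

// Hypothetical input shape for the check.
interface ContextCheck {
  contextWindow: number
  prevContextTokens: number
  maxTokens?: number
  effectiveThreshold: number // percent of the context window
}

// allowedTokens = contextWindow * (1 - buffer) - reservedTokens
function allowedTokens(contextWindow: number, maxTokens?: number): number {
  const reservedTokens = maxTokens || ANTHROPIC_DEFAULT_MAX_TOKENS
  return contextWindow * (1 - TOKEN_BUFFER_PERCENTAGE) - reservedTokens
}

// Condensation fires when either condition holds.
function shouldCondense(c: ContextCheck): boolean {
  const contextPercent = (100 * c.prevContextTokens) / c.contextWindow
  return contextPercent >= c.effectiveThreshold || c.prevContextTokens > allowedTokens(c.contextWindow, c.maxTokens)
}
```

With a 200k window, 8k reserved for output, and the default threshold of 100, the percentage condition cannot fire before the allowedTokens condition does, which is why the effective trigger point sits below 100% fill.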

Configurability

Three levels of control:

  1. Global toggle: autoCondenseContext (on/off)
  2. Global threshold: autoCondenseContextPercent — user-configurable, clamped to [5, 100]%
  3. Per-profile threshold: profileThresholds[currentProfileId] overrides the global threshold for the current profile; -1 means inherit the global value.
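The resolution order between the global and per-profile thresholds might look like the sketch below. resolveThreshold is a hypothetical helper, and falling back to the global value when a profile value is out of range is an assumption; the text only specifies the [5, 100] clamp and the -1 sentinel.

```typescript
const MIN_PERCENT = 5
const MAX_PERCENT = 100
const INHERIT_GLOBAL = -1

// Hypothetical helper: pick the threshold that applies to the current profile.
function resolveThreshold(
  globalPercent: number,
  profileThresholds: Record<string, number>,
  profileId: string,
): number {
  const clamp = (p: number): number => Math.min(MAX_PERCENT, Math.max(MIN_PERCENT, p))
  const profile = profileThresholds[profileId]
  // No entry, or the -1 sentinel: inherit the (clamped) global setting.
  if (profile === undefined || profile === INHERIT_GLOBAL) return clamp(globalPercent)
  // Assumption: an out-of-range profile value falls back to the global setting.
  if (profile < MIN_PERCENT || profile > MAX_PERCENT) return clamp(globalPercent)
  return profile
}
```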

Custom condensing prompt: customCondensingPrompt setting — replaces entire supportPrompt.default.CONDENSE template if set.


Trigger Points

Three scenarios trigger condensation:

1. Automatic — post-turn

src/core/task/Task.ts: recursivelyMakeClineRequests()

After each assistant turn, token count is checked. If over threshold, manageContext() runs.

2. Forced — on API context window error

src/core/task/Task.ts ~line 3850, constant FORCED_CONTEXT_REDUCTION_PERCENT = 75

If the API returns a context-window-exceeded error (checkContextWindowExceededError()), condensation is forced immediately with threshold overridden to 75% (i.e., "compress enough to keep 75% of the window").

3. Manual — user-triggered

The user can click a button in the UI to trigger condensation at any time. A manual trigger does not include environment details in the summary (they will be freshly injected on the next turn).


Fallback: Sliding Window Truncation

src/core/context-management/index.ts: truncateConversation()

When condensation fails (API error) or is disabled and tokens exceed allowedTokens, sliding window truncation runs:

  • Tags 50% of visible messages (excluding the first) with truncationParent: <uuid>
  • Inserts an isTruncationMarker user message at the boundary: "[Sliding window truncation: N messages hidden to reduce context]"
  • Also non-destructive: messages can be restored on rewind

Hidden messages are linked by a truncationParent UUID, the same pattern as condenseParent.
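A minimal sketch of the tagging approach follows. The names WindowMessage and slidingWindowTruncate are hypothetical. Hiding the oldest visible messages and rounding the count down to an even number (so user/assistant pairs are hidden together) are assumptions, not details confirmed by the text.

```typescript
// Hypothetical message shape; only the fields named in the text are modelled.
interface WindowMessage {
  role: "user" | "assistant"
  text: string
  truncationParent?: string
  isTruncationMarker?: boolean
}

function slidingWindowTruncate(messages: WindowMessage[], truncationId: string, fraction = 0.5): WindowMessage[] {
  // Visible means not already hidden; the first message is never a candidate.
  const isVisible = (msg: WindowMessage, i: number): boolean => i > 0 && !msg.truncationParent
  const visibleCount = messages.filter(isVisible).length
  // Assumption: round down to an even count.
  const raw = Math.floor(visibleCount * fraction)
  const hideCount = raw - (raw % 2)
  if (hideCount === 0) return messages

  const marker: WindowMessage = {
    role: "user",
    text: `[Sliding window truncation: ${hideCount} messages hidden to reduce context]`,
    isTruncationMarker: true,
  }
  const result: WindowMessage[] = []
  let hidden = 0
  messages.forEach((msg, i) => {
    if (hidden < hideCount && isVisible(msg, i)) {
      // Tag instead of delete, so a rewind can restore the message.
      result.push({ ...msg, truncationParent: truncationId })
      hidden += 1
      // The marker sits at the boundary between hidden and kept messages.
      if (hidden === hideCount) result.push(marker)
    } else {
      result.push(msg)
    }
  })
  return result
}
```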


Summary Message Structure

After condensation, the stored summary message looks like:

role: "user"
isSummary: true
condenseId: <uuid>
content: [
  { type: "text", text: "<analysis>...\n<summary>...</summary>" },
  // optionally:
  { type: "text", text: "<system-reminder>\n<command>...</command>\n</system-reminder>" },
  { type: "text", text: "<system-reminder>\n[Folded file context]\n</system-reminder>" },
  { type: "text", text: "<environment_details>...</environment_details>" },  // auto only
]

Three notable extras appended to the summary content:

<command> block preservation

extractCommandBlocks() — extracts <command>...</command> XML from the first message

Active shell commands and workflows from the original task are extracted and re-injected wrapped in <system-reminder> tags. They survive across multiple consecutive condensations.
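The extraction and re-wrapping could be sketched as below. extractCommandBlocksSketch and wrapInSystemReminder are hypothetical stand-ins, and the regular expression is an assumption about how the <command> XML might be matched; it is not taken from the repo.

```typescript
// Hypothetical re-implementation: collect every <command>...</command> block in a text.
function extractCommandBlocksSketch(firstMessageText: string): string[] {
  // Non-greedy match so two adjacent blocks are returned separately.
  return firstMessageText.match(/<command\b[^>]*>[\s\S]*?<\/command>/g) || []
}

// Wrap one extracted block the way the summary content shows it.
function wrapInSystemReminder(block: string): string {
  return `<system-reminder>\n${block}\n</system-reminder>`
}
```

Under this sketch, a wrapped block still contains the original <command> XML, so running the extraction again over a summary that carries it would find the same block. That is one way the blocks could persist across repeated condensations.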

Folded file context

src/core/condense/foldedFileContext.ts: generateFoldedFileContext()

Files read during the session (filesReadByRoo) are processed through tree-sitter to extract only function signatures and class declarations (not bodies). These are injected as additional <system-reminder> blocks alongside the summary. Capped at 50,000 characters total.

This means the model retains structural awareness of previously-read source files even after condensation.

Environment details (auto-trigger only)

When condensation fires automatically mid-turn (isAutomaticTrigger=true), the current environment details (open files, terminal output, etc.) are appended to the summary content. When triggered manually, they're omitted (fresh details will arrive on the next user turn).
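Putting the three extras together, the summary content could be assembled roughly as follows. SummaryParts and buildSummaryContent are hypothetical names; only the block order and the auto-only rule for environment details come from the description above.

```typescript
interface TextBlock {
  type: "text"
  text: string
}

// Hypothetical bundle of the pieces described in this section.
interface SummaryParts {
  summaryText: string
  commandBlocks: string[]
  foldedFileSections: string[]
  environmentDetails?: string
  isAutomaticTrigger: boolean
}

function buildSummaryContent(parts: SummaryParts): TextBlock[] {
  const reminder = (body: string): TextBlock => ({
    type: "text",
    text: `<system-reminder>\n${body}\n</system-reminder>`,
  })
  // The raw model output always comes first.
  const content: TextBlock[] = [{ type: "text", text: parts.summaryText }]
  parts.commandBlocks.forEach((block) => content.push(reminder(block)))
  parts.foldedFileSections.forEach((section) => content.push(reminder(section)))
  // Environment details are appended only for automatic triggers.
  if (parts.isAutomaticTrigger && parts.environmentDetails) {
    content.push({ type: "text", text: parts.environmentDetails })
  }
  return content
}
```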


Prompt Architecture

Two-layer design:

| Layer | Role | Content |
| --- | --- | --- |
| System | SUMMARY_PROMPT | Short anti-hallucination guard: "SYSTEM OPERATION, not user message, no tool calls" |
| Final user message | supportPrompt.default.CONDENSE | 9-section structured template with full example |

The user-turn prompt is fully replaceable via customCondensingPrompt. The system prompt is hardcoded.

The summarisation call uses the same apiHandler as the main conversation (same model). Tools are passed in the metadata but tool calls are blocked via the system prompt.
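The two layers can be pictured as a request built like this. buildCondenseRequest, ChatMessage, and CondenseRequest are hypothetical names, and treating a blank customCondensingPrompt as "not set" is an assumption based on its default of an empty string.

```typescript
interface ChatMessage {
  role: "user" | "assistant"
  content: string
}

// Hypothetical request shape: one fixed system prompt plus the message list.
interface CondenseRequest {
  systemPrompt: string
  messages: ChatMessage[]
}

function buildCondenseRequest(
  history: ChatMessage[],
  systemPrompt: string,
  defaultTemplate: string,
  customCondensingPrompt?: string,
): CondenseRequest {
  // Assumption: an empty or whitespace-only custom prompt means "use the default".
  const custom = customCondensingPrompt ? customCondensingPrompt.trim() : ""
  const instructions = custom.length > 0 ? custom : defaultTemplate
  // The instructions are appended as the final user message; the system layer is never replaced.
  return { systemPrompt, messages: [...history, { role: "user", content: instructions }] }
}
```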


Edge Cases

| Scenario | Handling |
| --- | --- |
| Condensed recently (summary at end of getMessagesSinceLastSummary) | Returns error if only 1 message since last summary |
| Orphaned tool_use without matching tool_result | injectSyntheticToolResults() adds fake result: "Context condensation triggered. Tool execution deferred." |
| Providers requiring tools param for tool blocks | transformMessagesForCondensing() converts all tool blocks to plain text |
| Image blocks in history | maybeRemoveImageBlocks() strips if provider doesn't support |
| Multiple condensations | Nested condenseParent handled: only new messages (without existing condenseParent) get tagged; prior tagged messages left as-is |
| Rewind past condensation | cleanupAfterTruncation() clears orphaned condenseParent/truncationParent refs → full history restored |
| Condensation API call fails | Falls through to sliding window truncation if tokens still exceed allowedTokens |
| Context window error from API | Forces condensation at 75% threshold; on repeated failure, truncation |
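The "condensed recently" guard can be illustrated with a small sketch. The function names, the result type, and the error string are hypothetical; the only behaviour taken from the description above is that a window holding a single message since the last summary produces an error instead of a new summary.

```typescript
interface HistoryMessage {
  role: "user" | "assistant"
  text: string
  isSummary?: boolean
}

// Assumed behaviour: return the last summary (if any) and everything after it.
function messagesSinceLastSummary(messages: HistoryMessage[]): HistoryMessage[] {
  let lastSummary = -1
  messages.forEach((msg, i) => {
    if (msg.isSummary) lastSummary = i
  })
  return lastSummary === -1 ? messages : messages.slice(lastSummary)
}

type GuardResult = { ok: true; toSummarize: HistoryMessage[] } | { ok: false; error: string }

// Refuse to condense when there is nothing new to summarise.
function checkCondensable(messages: HistoryMessage[]): GuardResult {
  const recent = messagesSinceLastSummary(messages)
  if (recent.length <= 1) {
    return { ok: false, error: "Condensed recently: nothing new to summarise" } // hypothetical wording
  }
  return { ok: true, toSummarize: recent }
}
```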

Configuration Reference

| Setting | Default | Range | Description |
| --- | --- | --- | --- |
| autoCondenseContext | true | bool | Master toggle |
| autoCondenseContextPercent | 100 | 5–100 | % of context window that triggers condensation |
| profileThresholds[profileId] | undefined | 5–100 or -1 | Per-profile override; -1 = inherit global |
| customCondensingPrompt | "" | string | Replaces entire user-turn condensation prompt |

Key Files

| File | Purpose |
| --- | --- |
| src/core/condense/index.ts | Core: summarizeConversation(), getEffectiveApiHistory(), truncateConversation(), SUMMARY_PROMPT |
| src/shared/support-prompt.ts | supportPrompt.default.CONDENSE: the 9-section user-turn prompt |
| src/core/context-management/index.ts | manageContext(), willManageContext(), TOKEN_BUFFER_PERCENTAGE, threshold logic |
| src/core/condense/foldedFileContext.ts | Tree-sitter file signature extraction for post-summary injection |
| src/core/task/Task.ts | Trigger points: auto (~line 4000), forced (~line 3850), UI flow |
| src/core/context/context-management/context-error-handling.ts | API error detection for forced condensation |
| src/core/message-manager/index.ts | MessageManager: rewind/undo handling, cleanup of orphaned parent refs |

Roo Code Compression Prompt

Repo: https://github.com/RooVetGit/Roo-Code
Commit: 3e237e6 (v0.1.17)
Key files:

  • src/core/condense/index.ts: SUMMARY_PROMPT, summarizeConversation(), getEffectiveApiHistory()
  • src/shared/support-prompt.ts: supportPrompt.default.CONDENSE (the user-turn template)
  • src/core/context-management/index.ts: manageContext(), threshold logic

Two-Layer Prompt Architecture

Roo Code makes a separate LLM call for condensation using two layers:

Layer 1 — System prompt (SUMMARY_PROMPT)

src/core/condense/index.ts, const SUMMARY_PROMPT

You are a helpful AI assistant tasked with summarizing conversations.

CRITICAL: This summarization request is a SYSTEM OPERATION, not a user message.
Your ONLY task is to analyze the conversation and produce a text summary.
Respond with text only - no tool calls will be processed.

CRITICAL: This summarization request is a SYSTEM OPERATION, not a user message.
When analyzing "user requests" and "user intent", completely EXCLUDE this summarization message.
The "most recent user request" and "next step" must be based on what the user was doing BEFORE this system message appeared.
The goal is for work to continue seamlessly after condensation - as if it never happened.

Layer 2 — User-turn condensing instructions (supportPrompt.default.CONDENSE)

src/shared/support-prompt.ts, CONDENSE.template

This is appended as the final user message in the conversation being summarised.
Users can replace it entirely via the customCondensingPrompt setting.

CRITICAL: This summarization request is a SYSTEM OPERATION, not a user message.
When analyzing "user requests" and "user intent", completely EXCLUDE this summarization message.
The "most recent user request" and "Optional Next Step" must be based on what the user was doing BEFORE this system message appeared.
The goal is for work to continue seamlessly after condensation - as if it never happened.

Your task is to create a detailed summary of the conversation so far, paying close attention to the user's explicit requests and your previous actions.
This summary should be thorough in capturing technical details, code patterns, and architectural decisions that would be essential for continuing development work without losing context.

Before providing your final summary, wrap your analysis in <analysis> tags to organize your thoughts and ensure you've covered all necessary points. In your analysis process:

1. Chronologically analyze each message and section of the conversation. For each section thoroughly identify:
   - The user's explicit requests and intents
   - Your approach to addressing the user's requests
   - Key decisions, technical concepts and code patterns
   - Specific details like:
     - file names
     - full code snippets
     - function signatures
     - file edits
   - Errors that you ran into and how you fixed them
   - Pay special attention to specific user feedback that you received, especially if the user told you to do something differently.
2. Double-check for technical accuracy and completeness, addressing each required element thoroughly.

Your summary should include the following sections:

1. Primary Request and Intent: Capture all of the user's explicit requests and intents in detail
2. Key Technical Concepts: List all important technical concepts, technologies, and frameworks discussed.
3. Files and Code Sections: Enumerate specific files and code sections examined, modified, or created. Pay special attention to the most recent messages and include full code snippets where applicable and include a summary of why this file read or edit is important.
4. Errors and fixes: List all errors that you ran into, and how you fixed them. Pay special attention to specific user feedback that you received, especially if the user told you to do something differently.
5. Problem Solving: Document problems solved and any ongoing troubleshooting efforts.
6. All user messages: List ALL user messages that are not tool results. These are critical for understanding the users' feedback and changing intent.
7. Pending Tasks: Outline any pending tasks that you have explicitly been asked to work on.
8. Current Work: Describe in detail precisely what was being worked on immediately before this summary request, paying special attention to the most recent messages from both user and assistant. Include file names and code snippets where applicable.
9. Optional Next Step: List the next step that you will take that is related to the most recent work you were doing. IMPORTANT: ensure that this step is DIRECTLY in line with the user's most recent explicit requests, and the task you were working on immediately before this summary request. If your last task was concluded, then only list next steps if they are explicitly in line with the users request. Do not start on tangential requests or really old requests that were already completed without confirming with the user first.

If there is a next step, include direct quotes from the most recent conversation showing exactly what task you were working on and where you left off. This should be verbatim to ensure there's no drift in task interpretation.

Here's an example of how your output should be structured:

<example>
<analysis>
[Your thought process, ensuring all points are covered thoroughly and accurately]
</analysis>

<summary>
1. Primary Request and Intent:
   [Detailed description]

2. Key Technical Concepts:
   - [Concept 1]
   - [Concept 2]
   - [...]

3. Files and Code Sections:
   - [File Name 1]
      - [Summary of why this file is important]
      - [Summary of the changes made to this file, if any]
      - [Important Code Snippet]
   - [File Name 2]
      - [Important Code Snippet]
   - [...]

4. Errors and fixes:
   - [Detailed description of error 1]:
      - [How you fixed the error]
      - [User feedback on the error if any]
   - [...]

5. Problem Solving:
   [Description of solved problems and ongoing troubleshooting]

6. All user messages:
   - [Detailed non tool use user message]
   - [...]

7. Pending Tasks:
   - [Task 1]
   - [Task 2]
   - [...]

8. Current Work:
   [Precise description of current work]

9. Optional Next Step:
   [Optional Next step to take]

</summary>
</example>

Please provide your summary based on the conversation so far, following this structure and ensuring precision and thoroughness in your response.

Note: Any <command> blocks from the original task will be automatically appended to your summary wrapped in <system-reminder> tags. You do not need to include them in your summary text.

There may be additional summarization instructions provided in the included context. If so, remember to follow these instructions when creating the above summary.

What Gets Sent to the Summarization API

SYSTEM:  SUMMARY_PROMPT  (anti-hallucination guard)

USER:    [all messages since last summary, tool blocks converted to text]
USER:    [condense instructions — the CONDENSE template above]

The messages being summarised are preprocessed:

  1. injectSyntheticToolResults() — adds fake tool_result for any orphaned tool_use (prevents API rejection)
  2. transformMessagesForCondensing() — converts tool_use / tool_result blocks → plain text (no tools parameter needed)
  3. maybeRemoveImageBlocks() — strips images if provider doesn't support them in history
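The first two preprocessing steps can be sketched as follows. injectSyntheticResults and toolBlocksToText are hypothetical stand-ins for the functions named above, the block types are a simplified model of tool_use and tool_result content blocks, and the bracketed text format for converted blocks is invented for illustration. Only the synthetic result string is quoted from the analysis.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string }

interface ApiMessage {
  role: "user" | "assistant"
  content: Block[]
}

const SYNTHETIC_RESULT = "Context condensation triggered. Tool execution deferred."

// Step 1: give every unanswered tool_use a placeholder tool_result.
function injectSyntheticResults(messages: ApiMessage[]): ApiMessage[] {
  const answered = new Set<string>()
  const requested: string[] = []
  messages.forEach((msg) =>
    msg.content.forEach((block) => {
      if (block.type === "tool_use") requested.push(block.id)
      if (block.type === "tool_result") answered.add(block.tool_use_id)
    }),
  )
  const orphans = requested.filter((id) => !answered.has(id))
  if (orphans.length === 0) return messages
  const synthetic: ApiMessage = {
    role: "user",
    content: orphans.map((id): Block => ({ type: "tool_result", tool_use_id: id, content: SYNTHETIC_RESULT })),
  }
  return [...messages, synthetic]
}

// Step 2: flatten tool blocks to text so no tools parameter is needed.
function toolBlocksToText(messages: ApiMessage[]): ApiMessage[] {
  const toText = (block: Block): Block => {
    if (block.type === "tool_use") {
      return { type: "text", text: `[Tool use: ${block.name}] ${JSON.stringify(block.input)}` }
    }
    if (block.type === "tool_result") return { type: "text", text: `[Tool result] ${block.content}` }
    return block
  }
  return messages.map((msg) => ({ ...msg, content: msg.content.map(toText) }))
}
```

The order matters in this sketch: orphans are detected while the blocks are still typed, and only then is everything flattened to text.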

Summary Output Format

Model is expected to produce:

<analysis>
[Reasoning / self-check; kept in the stored text, as noted below]
</analysis>

<summary>
1. Primary Request and Intent: ...
2. Key Technical Concepts: ...
3. Files and Code Sections: ...
4. Errors and fixes: ...
5. Problem Solving: ...
6. All user messages: ...
7. Pending Tasks: ...
8. Current Work: ...
9. Optional Next Step: ...
</summary>

The full raw text (including <analysis>) is stored as the summary. The harness doesn't parse sections — the entire text becomes the content of the injected summary message.
