Package: @anthropic-ai/claude-code v2.1.68
Source: Deobfuscated from the 12MB minified cli.js bundle.
"Want to see the unminified source? We're hiring!" — Anthropic
Claude Code has five distinct context management mechanisms:
| Mechanism | LLM? | Trigger | Scope |
|---|---|---|---|
| Auto full compact | ✅ | After turn, tokens ≥ threshold | Full history replacement |
| Manual /compact | ✅ | User command | Full or partial |
| Sub-agent compact | ✅ | Before turn in sub-agent loop | Sub-agent history |
| Microcompact | ❌ | Warning threshold hit | Clears old tool results only |
| Session memory compact | ❌ | Auto-compact trigger | Uses stored session memory |
System prompt:
You are a helpful AI assistant tasked with summarizing conversations.
User prompt (appended after the full history):
Your task is to create a detailed summary of the conversation so far, paying close attention to
the user's explicit requests and your previous actions.
This summary should be thorough in capturing technical details, code patterns, and architectural
decisions that would be essential for continuing development work without losing context.
Before providing your final summary, wrap your analysis in <analysis> tags to organize your
thoughts and ensure you've covered all necessary points. In your analysis process:
1. Chronologically analyze each message and section of the conversation. For each section
   thoroughly identify:
   - The user's explicit requests and intents
   - Your approach to addressing the user's requests
   - Key decisions, technical concepts and code patterns
   - Specific details like:
     - file names
     - full code snippets
     - function signatures
     - file edits
   - Errors that you ran into and how you fixed them
   - Pay special attention to specific user feedback that you received, especially if the user
     told you to do something differently.
2. Double-check for technical accuracy and completeness, addressing each required element
   thoroughly.
Your summary should include the following sections:
1. Primary Request and Intent: Capture all of the user's explicit requests and intents in detail
2. Key Technical Concepts: List all important technical concepts, technologies, and frameworks
discussed.
3. Files and Code Sections: Enumerate specific files and code sections examined, modified, or
created. Pay special attention to the most recent messages and include full code snippets
where applicable and include a summary of why this file read or edit is important.
4. Errors and fixes: List all errors that you ran into, and how you fixed them. Pay special
attention to specific user feedback that you received, especially if the user told you to do
something differently.
5. Problem Solving: Document problems solved and any ongoing troubleshooting efforts.
6. All user messages: List ALL user messages that are not tool results. These are critical for
understanding the users' feedback and changing intent.
7. Pending Tasks: Outline any pending tasks that you have explicitly been asked to work on.
8. Current Work: Describe in detail precisely what was being worked on immediately before this
summary request, paying special attention to the most recent messages from both user and
assistant. Include file names and code snippets where applicable.
9. Optional Next Step: List the next step that you will take that is related to the most recent
work you were doing. IMPORTANT: ensure that this step is DIRECTLY in line with the user's
most recent explicit requests, and the task you were working on immediately before this
summary request. [...]
If there is a next step, include direct quotes from the most recent
conversation showing exactly what task you were working on and where you left off. This should
be verbatim to ensure there's no drift in task interpretation.
[example output structure with <analysis> and <summary> XML blocks...]
[If PreCompact hook or /compact instructions provided:]
Additional Instructions:
{customInstructions}
IMPORTANT: Do NOT use any tools. You MUST respond with ONLY the <summary>...</summary> block
as your text output.
Partial compact prompt: identical in structure to the full compact prompt above, but scoped to the "RECENT portion only":
Your task is to create a detailed summary of the RECENT portion of the conversation — the
messages that follow earlier retained context. The earlier messages are being kept intact and
do NOT need to be summarized. Focus your summary on what was discussed, learned, and
accomplished in the recent messages only.
[...same 9-section structure...]
Please provide your summary based on the RECENT messages only (after the retained earlier
context), following this structure and ensuring precision and thoroughness in your response.
Sub-agent compact prompt: used when an in-process sub-agent approaches its context limit. It is sent as a user message before the next turn:
You have been working on the task described above but have not yet completed it. Write a
continuation summary that will allow you (or another instance of yourself) to resume work
efficiently in a future context window where the conversation history will be replaced with
this summary. Your summary should be structured, concise, and actionable. Include:
1. Task Overview
The user's core request and success criteria
Any clarifications or constraints they specified
2. Current State
What has been completed so far
Files created, modified, or analyzed (with paths if relevant)
Key outputs or artifacts produced
3. Important Discoveries
Technical constraints or requirements uncovered
Decisions made and their rationale
Errors encountered and how they were resolved
What approaches were tried that didn't work (and why)
4. Next Steps
Specific actions needed to complete the task
Any blockers or open questions to resolve
Priority order if multiple steps remain
5. Context to Preserve
User preferences or style requirements
Domain-specific details that aren't obvious
Any promises made to the user
Be concise but complete — err on the side of including information that would prevent duplicate
work or repeated mistakes. Write in a way that enables immediate resumption of the task.
Wrap your summary in <summary></summary> tags.
After compaction, history starts with this user message:
This session is being continued from a previous conversation that ran out of context. The
summary below covers the earlier portion of the conversation.
[<analysis> block reformatted as plain text]
[<summary> block reformatted as plain text]
[If transcript available:]
If you need specific details from before compaction (like exact code snippets, error messages,
or content you generated), read the full transcript at: {transcriptPath}
[If partial compact — some messages kept verbatim:]
Recent messages are preserved verbatim.
[If auto-compact:]
Please continue the conversation from where we left off without asking the user any further
questions. Continue with the last task that you were asked to work on.
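The conditional pieces of this message can be pictured as a small builder. The following is only a sketch: the function name, the option names, and the blank-line separator are assumptions for illustration, while the text fragments are the ones quoted above.

```javascript
// Hypothetical helper: assembles the post-compaction user message.
// Option names and the "\n\n" separator are illustrative assumptions.
function buildContinuationMessage({ summary, transcriptPath, isPartial, isAuto }) {
  const parts = [
    "This session is being continued from a previous conversation that ran out of context. " +
      "The summary below covers the earlier portion of the conversation.",
    summary,
  ];
  if (transcriptPath) {
    // Only added when a transcript file is available.
    parts.push(
      "If you need specific details from before compaction (like exact code snippets, " +
        "error messages, or content you generated), read the full transcript at: " +
        transcriptPath
    );
  }
  if (isPartial) {
    // Partial compact: some recent messages follow the summary unchanged.
    parts.push("Recent messages are preserved verbatim.");
  }
  if (isAuto) {
    // Auto-compact: the model is told to resume without prompting the user.
    parts.push(
      "Please continue the conversation from where we left off without asking the user " +
        "any further questions. Continue with the last task that you were asked to work on."
    );
  }
  return parts.join("\n\n");
}
```

A manual /compact would call this with `isAuto: false`, so the final "please continue" paragraph is left out.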
// Constants
const BcK = 200_000 // default context window for non-extended models
const wk8 = 13_000 // safety buffer
const a5Y = 20_000 // output reservation cap
function B96(model) {
  // Usable window = context - output reservation
  return contextWindow(model) - Math.min(maxOutputTokens(model), a5Y)
}
function fQ6(model) {
  // Auto-compact fires here
  const base = B96(model) - wk8 // additional safety buffer
  const override = process.env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE // 1–100
  if (override) {
    const pct = parseFloat(override) // percentage of the usable window
    return Math.min(Math.floor(B96(model) * pct / 100), base)
  }
  return base
}
Example (Claude Sonnet 3.5: 200k context, 8192 max output):
B96 = 200000 - min(8192, 20000) = 191808
fQ6 = 191808 - 13000 = 178808 ≈ 89.4% of context
Warning display: shown at fQ6 - 20000 (contextWindow − min(maxOutput, 20k) − 33k)
Blocking limit: contextWindow - 3000 (absolute hard stop)
The threshold check uses the actual token count from the API response (input_tokens + cache_* + output_tokens), not a local estimate.
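All four limits can be derived from two model properties. The sketch below is an illustration only: the function and constant names are hypothetical, and the numeric values are the ones quoted above.

```javascript
// Sketch only: names are hypothetical; values mirror the constants quoted above.
const OUTPUT_RESERVATION_CAP = 20000; // a5Y
const SAFETY_BUFFER = 13000;          // wk8
const WARNING_OFFSET = 20000;         // warning shows this far below the auto-compact point
const BLOCKING_MARGIN = 3000;         // hard stop this far below the context window

function computeThresholds({ contextWindow, maxOutputTokens, pctOverride }) {
  // Usable window: context minus the (capped) output reservation.
  const usable = contextWindow - Math.min(maxOutputTokens, OUTPUT_RESERVATION_CAP);
  let autoCompact = usable - SAFETY_BUFFER;
  if (pctOverride !== undefined) {
    // The override can only lower the trigger, never raise it above the default.
    autoCompact = Math.min(Math.floor((usable * pctOverride) / 100), autoCompact);
  }
  return {
    usable,
    autoCompact,
    warning: autoCompact - WARNING_OFFSET,
    blocking: contextWindow - BLOCKING_MARGIN,
  };
}

// 200k context, 8192 max output:
console.log(computeThresholds({ contextWindow: 200000, maxOutputTokens: 8192 }));
// → { usable: 191808, autoCompact: 178808, warning: 158808, blocking: 197000 }
```

With `pctOverride: 50` the auto-compact point drops to 95,904 tokens, while `pctOverride: 100` leaves it at the default 178,808 because of the `Math.min` cap.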
Before:
[user msg 1] [assistant 1] [user msg 2] [assistant 2] ... [user msg N]
bG6() flow:
- Count tokens (tk(messages))
- Run PreCompact hooks → may inject custom instructions into prompt
- Check session memory (QP1): if a stored summary exists and fits, skip the LLM
- Build API request: full history + summary prompt → model (same as conversation)
  - thinkingConfig: { type: "disabled" } (extended thinking turned off)
  - maxOutputTokensOverride: 20000
  - Tools: read_file only
- Stream response, extract <summary>...</summary> block
- Clear readFileState
- Re-inject: recently-read files (bM4), plan file (IP1), skills (uM4), plan mode (mM4)
- Run session start hooks
- Return: { boundaryMarker, summaryMessages, attachments, hookResults }
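The "extract <summary> block" step could look roughly like this. This is a minimal sketch, not the bundle's code: the helper name is hypothetical, and treating a missing <summary> block the same way as an empty response is an assumption.

```javascript
// Hypothetical helper: pulls the <summary> body out of the model's text output.
// Assumption: a response with no <summary> block is handled like an empty response.
function extractSummary(responseText) {
  const match = /<summary>([\s\S]*?)<\/summary>/.exec(responseText || "");
  const summary = match ? match[1].trim() : "";
  if (!summary) {
    throw new Error("Failed to generate conversation summary");
  }
  return summary;
}
```

The non-greedy `[\s\S]*?` matches across newlines and stops at the first closing tag, so an <analysis> block that precedes the summary is ignored.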
After (via A66()):
[boundaryMarker: "Conversation compacted"]
[summaryMessage: JQ6(summary, ...)]
[messagesToKeep: verbatim recent msgs — partial compact only]
[attachments: re-injected files/skills/plan]
[hookResults: session start outputs]
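The ordering above amounts to a simple concatenation. The sketch below assumes the field names returned by the compaction step and uses a hypothetical function name; it does not reproduce A66() itself.

```javascript
// Hypothetical sketch of the post-compaction ordering; not the actual A66().
function assemblePostCompactHistory({
  boundaryMarker,
  summaryMessages,
  messagesToKeep = [], // only populated for a partial compact
  attachments = [],
  hookResults = [],
}) {
  return [
    boundaryMarker,
    ...summaryMessages,
    ...messagesToKeep,
    ...attachments,
    ...hookResults,
  ];
}
```

For a full compact `messagesToKeep` stays empty, so the new history is the boundary marker, the summary, the re-injected attachments, and the hook output.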
Cache sharing (feature flag tengu_compact_cache_prefix): before calling the LLM, compaction tries to reuse a result cached from another session with the same conversation prefix. On a cache miss it falls back to the normal LLM call.
Microcompact is implemented by the function Rg(), which runs during message serialization, before each API call.
Constants:
const g3Y = 40_000 // protect this many tokens of recent tool results
const F3Y = 3 // always keep last N tool results intact
const B3Y = 20_000 // minimum savings threshold to bother
const eV8 = 2_000 // estimated tokens per image/document
Trigger: isAboveWarningThreshold AND clearable tool result tokens > 20k
Algorithm:
- Find all tool_use/tool_result pairs for eligible tools (bash, read_file, grep, etc.)
- Keep last F3Y=3 tool results protected always
- Scan backwards: accumulate tool result sizes until > g3Y=40k tokens counted
- Everything beyond that 40k window: eligible to clear
- If eligible tokens > B3Y=20k: strip them
  - Tool results → "[Tool result cleared]" (or saved to temp file with re-read instructions)
  - Images/documents in user messages → "[image]" / "[document]"
- Cleared tool IDs tracked in U96 set (persists across turns)
No LLM call. Purely in-memory message transformation.
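The selection logic can be sketched as follows. This is an illustration under stated assumptions, not the code of Rg(): the `{ id, tokens, content }` shape and all names are hypothetical, the three most recent results are assumed to count toward the 40k window, and the result that crosses the 40k mark is assumed to stay protected. The temp-file variant and image handling are left out.

```javascript
// Sketch under assumptions; names and data shape are hypothetical.
const PROTECT_WINDOW_TOKENS = 40000; // g3Y
const KEEP_LAST_RESULTS = 3;         // F3Y
const MIN_SAVINGS_TOKENS = 20000;    // B3Y
const CLEARED_PLACEHOLDER = "[Tool result cleared]";

// toolResults: chronological array of { id, tokens, content }.
// clearedIds: Set that persists across turns (the role U96 plays above).
function microcompact(toolResults, clearedIds = new Set()) {
  let running = 0; // tokens counted so far, newest first
  let kept = 0;    // number of protected results
  const eligible = [];
  for (let i = toolResults.length - 1; i >= 0; i--) {
    const result = toolResults[i];
    if (clearedIds.has(result.id)) continue; // cleared on an earlier turn
    if (kept < KEEP_LAST_RESULTS || running <= PROTECT_WINDOW_TOKENS) {
      kept++;
      running += result.tokens;
    } else {
      eligible.push(result); // older than the protected window
    }
  }
  const savings = eligible.reduce((sum, r) => sum + r.tokens, 0);
  if (savings > MIN_SAVINGS_TOKENS) {
    for (const r of eligible) clearedIds.add(r.id);
  }
  // Pure in-memory transformation: cleared results keep their slot, lose their content.
  return toolResults.map((r) =>
    clearedIds.has(r.id) ? { ...r, content: CLEARED_PLACEHOLDER } : r
  );
}
```

Because `clearedIds` is passed in and mutated, a result cleared on one turn stays cleared on every later serialization, even if the savings check would not fire again.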
Settings UI: autoCompactEnabled toggle — "Auto-compact when context is full"
Environment variables:
| Variable | Effect |
|---|---|
| DISABLE_COMPACT | Disable ALL compaction including /compact command |
| DISABLE_AUTO_COMPACT | Disable auto-compact only; /compact still works |
| DISABLE_MICROCOMPACT | Disable microcompact |
| CLAUDE_AUTOCOMPACT_PCT_OVERRIDE | Float 1–100: trigger at this % of B96 (capped at the default threshold) |
| CLAUDE_CODE_MAX_OUTPUT_TOKENS | Override model max output tokens |
| CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE | Override hard blocking limit |
| CLAUDE_AFTER_LAST_COMPACT | Fetch only session logs after last compact point |
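How these switches might combine is sketched below. Several details are assumptions rather than findings from the bundle: the rule for when a variable counts as "set", the reading of "ALL compaction" as also covering microcompact, and every function and field name.

```javascript
// Assumption: a variable counts as "set" when it is non-empty and not "0" or "false".
function isSet(value) {
  return value !== undefined && value !== "" && value !== "0" && value.toLowerCase() !== "false";
}

// Hypothetical helper: resolves the compaction switches from an env object
// (for example process.env) and the settings toggle.
function readCompactionConfig(env, settings = { autoCompactEnabled: true }) {
  const allDisabled = isSet(env.DISABLE_COMPACT);
  const pct = parseFloat(env.CLAUDE_AUTOCOMPACT_PCT_OVERRIDE);
  return {
    manualCompact: !allDisabled,
    autoCompact:
      !allDisabled && !isSet(env.DISABLE_AUTO_COMPACT) && settings.autoCompactEnabled === true,
    // Assumption: "ALL compaction" is read literally and includes microcompact.
    microcompact: !allDisabled && !isSet(env.DISABLE_MICROCOMPACT),
    // Values outside 1–100 (or unparsable ones) are ignored.
    pctOverride: pct >= 1 && pct <= 100 ? pct : undefined,
  };
}
```

For example, `readCompactionConfig({ DISABLE_AUTO_COMPACT: "1" })` leaves `manualCompact` and `microcompact` enabled and turns off only `autoCompact`.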
- Compaction of compaction: if the compaction call itself returns a "compact" result (overflow during compaction), throws ContextOverflowError
- Empty response: throws "Failed to generate conversation summary"
- API error response (WO prefix): re-throws with original error
- Prompt too long (Mc prefix): throws user-facing "context too large to compact"
- Auto-compact failure: silently returns { wasCompacted: false }; the session continues with uncompacted history
- Microcompact + compaction: both can fire in the same turn; microcompact runs inline during serialization, full compaction runs after
- Session memory fallback: if session memory compaction result still exceeds threshold, falls through to LLM compaction
- Streaming retry: tengu_compact_streaming_retry flag enables retrying compaction up to k5Y times on stream failure
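Two of these behaviours, the silent auto-compact failure and the streaming retry, can be illustrated with a small wrapper. This is a synchronous sketch for readability (the real call would be asynchronous); `runCompaction`, `isStreamFailure`, `maxRetries`, and the `result` field are hypothetical, and the value of k5Y is not known here, so it is passed in as a parameter.

```javascript
// Hypothetical wrapper; the callbacks are stand-ins supplied by the caller.
// maxRetries plays the role of k5Y (value not stated in the source).
function tryAutoCompact(runCompaction, { maxRetries = 0, isStreamFailure = () => false } = {}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { wasCompacted: true, result: runCompaction() };
    } catch (err) {
      // Only stream failures are retried, and only while retries remain.
      const canRetry = attempt < maxRetries && isStreamFailure(err);
      if (!canRetry) {
        // Swallow the error: the session continues with uncompacted history.
        return { wasCompacted: false };
      }
    }
  }
  return { wasCompacted: false };
}
```

A manual /compact would presumably surface the error to the user instead of swallowing it; that path is not shown.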
| Property | Value |
|---|---|
| Mechanism | Full history replacement (extract) |
| Threshold | contextWindow - min(maxOutputTokens, 20k) - 13k |
| Example (Sonnet 3.5) | 178,808 / 200,000 ≈ 89.4% |
| Token source | Actual API token count (not estimated) |
| Configurable | Yes — env CLAUDE_AUTOCOMPACT_PCT_OVERRIDE, settings toggle |
| Prompt | 9-section structured <analysis> + <summary> XML |
| History to LLM | Full history, no truncation |
| Model | Same mainLoopModel as conversation |
| Max output | 20,000 tokens (hardcoded override) |
| Extended thinking | Disabled during compaction |
| Post-compaction | Boundary marker + continuation message + re-injected files/skills/plan |
| Microcompact | Separate, no LLM, inline tool result clearing |
| Hook | PreCompact — can inject instructions into summary prompt |
| Cache sharing | Experimental — reuse across sessions with same prefix |