---
id: "concept-transcript-compaction"
type: "concept"
source_timestamps: ["00:19:29"]
tags: ["cost-management", "context-window"]
related: ["concept-predictive-token-budgeting", "prereq-llm-token-economics"]
definition: "Automatically summarizing or truncating older conversation history to save tokens while ensuring the full immutable log is persisted elsewhere."
sources: ["s46-anthropic-25b-leak"]
sourceVaultSlug: "s46-anthropic-25b-leak"
originDay: 46
---
# Transcript Compaction

## Definition
Automatically summarizing or truncating older conversation history to save tokens, while ensuring the **full immutable log is persisted elsewhere**.

## How It Works in [[entity-claude-code-d46|Claude Code]]
The system monitors token usage and, upon hitting a configurable threshold, **auto-compacts** the conversation. Crucially:

- **Recent entries are kept intact.**
- **Older entries are discarded or summarized.**
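The threshold-then-compact loop can be sketched in a few lines. This is a minimal illustration, not Claude Code's actual implementation; the `Entry` type, the token counts, and the placeholder summary stub are all assumptions (a real system would call an LLM to produce the summary):

```python
from dataclasses import dataclass

@dataclass
class Entry:
    role: str
    text: str
    tokens: int

def compact(transcript: list[Entry], budget: int, keep_recent: int) -> list[Entry]:
    """If total tokens exceed `budget`, keep the last `keep_recent`
    entries intact and replace everything older with a summary stub."""
    total = sum(e.tokens for e in transcript)
    if total <= budget:
        return transcript  # under threshold: no compaction needed
    recent = transcript[-keep_recent:]
    older = transcript[:-keep_recent]
    # Hypothetical stub; in practice an LLM summarizes the older entries.
    summary = Entry("system", f"[summary of {len(older)} earlier entries]", 20)
    return [summary] + recent
```

The key property is that `recent` is passed through untouched; only the older prefix is collapsed.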

## The Critical Safeguard
To prevent data loss, the underlying transcript store **tracks whether the full history has been persisted elsewhere** (audit log / object store / dual-logged system events). Compaction is only safe when the full record exists somewhere durable.
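One way to model the safeguard is a store that refuses to compact until every entry slated for removal has been acknowledged by durable storage. A minimal sketch under that assumption (the `TranscriptStore` name and its methods are illustrative, not an actual API):

```python
class TranscriptStore:
    """Compaction is refused unless every entry to be dropped
    has been acknowledged as persisted elsewhere."""
    def __init__(self):
        self.entries: list[dict] = []
        self.persisted_ids: set[int] = set()

    def append(self, entry_id: int, text: str) -> None:
        self.entries.append({"id": entry_id, "text": text})

    def mark_persisted(self, entry_id: int) -> None:
        # Called once the audit log / object store confirms the write.
        self.persisted_ids.add(entry_id)

    def compact(self, keep_recent: int) -> bool:
        older = self.entries[:-keep_recent]
        if any(e["id"] not in self.persisted_ids for e in older):
            return False  # refuse: full record not yet durable
        self.entries = self.entries[-keep_recent:]
        return True
```

The design choice here is fail-closed: if durability cannot be proven, the compactor does nothing rather than risk losing the only copy.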

## Why It Matters
As conversations grow, per-request cost scales linearly with context length, and cumulative cost across a multi-turn conversation scales quadratically when the full history is resent on every turn (see [[prereq-llm-token-economics]]). Compaction keeps the LLM's immediate context window **lean and cost-effective** without destroying the audit trail.

## Pairs With
[[concept-predictive-token-budgeting]] — together they form the cost-management layer of a production agent harness.

## Validation (Enrichment)
Ubiquitous. Auto-summarization in ChatGPT/Claude APIs and most agent frameworks works similarly, with full logs persisted separately for replay and audit.
