Context Compression Strategies
When agent sessions generate millions of tokens, compression becomes mandatory. The naive approach is aggressive compression to minimize tokens per request. The correct optimization target is tokens per task: total tokens consumed to complete a task, including re-fetching costs when compression loses critical information.
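The difference between the two targets can be made concrete with a small accounting sketch. The `TaskRun` type and every figure below are hypothetical, chosen only to illustrate how re-fetching can erase per-request savings.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """Token accounting for one task (all figures are hypothetical)."""
    request_tokens: list[int]   # tokens sent on each request in the task
    refetch_tokens: int = 0     # extra tokens spent re-reading lost context


def tokens_per_request(run: TaskRun) -> float:
    """Average tokens per request: the naive optimization target."""
    return sum(run.request_tokens) / len(run.request_tokens)


def tokens_per_task(run: TaskRun) -> int:
    """Total tokens to finish the task, including re-fetching costs."""
    return sum(run.request_tokens) + run.refetch_tokens


# Aggressive compression: smaller requests, but two extra requests and
# 30,000 tokens spent re-reading files the agent forgot (assumed numbers).
aggressive = TaskRun(request_tokens=[4_000] * 12, refetch_tokens=30_000)
# Gentler compression: larger requests, no re-fetching (assumed numbers).
structured = TaskRun(request_tokens=[6_000] * 10)

for name, run in (("aggressive", aggressive), ("structured", structured)):
    print(name, tokens_per_request(run), tokens_per_task(run))
```

In this made-up scenario the aggressive run wins on tokens per request (4,000 vs 6,000) but loses on tokens per task (78,000 vs 60,000).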
When to Activate
Activate this skill when:
- Agent sessions exceed context window limits
- Codebases exceed context windows (5M+ token systems)
- Designing conversation summarization strategies
- Debugging cases where agents "forget" what files they modified
Core Approaches
1. Anchored Iterative Summarization (Recommended)
Maintain structured, persistent summaries with explicit sections. When compression triggers, summarize only the newly truncated span and merge the result with the existing summary.
Key insight: Structure forces preservation. Dedicated sections act as checklists.
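A minimal sketch of the merge step, assuming the summary is a mapping from section name to a list of entries. The section names, the replace-versus-append rule, and the `summarize_span` callback (a stand-in for an LLM call) are illustrative assumptions, not a prescribed implementation.

```python
from typing import Callable

# Section names are illustrative; adapt them to your own summary layout.
SECTIONS = ("Session Intent", "Files Modified", "Decisions Made",
            "Current State", "Next Steps")
# Assumption: these sections describe the present, so newer content replaces older.
STATE_SECTIONS = ("Current State", "Next Steps")

Summary = dict[str, list[str]]


def merge_summaries(existing: Summary, update: Summary) -> Summary:
    """Merge a span summary into the persistent summary, section by section."""
    merged: Summary = {}
    for section in SECTIONS:
        old = existing.get(section, [])
        new = update.get(section, [])
        if section in STATE_SECTIONS:
            # Replace only when the new span actually said something.
            merged[section] = list(new) if new else list(old)
        else:
            # Log-like sections accumulate, skipping entries already recorded.
            merged[section] = list(old) + [e for e in new if e not in old]
    return merged


def compress(existing: Summary, truncated_span: list[str],
             summarize_span: Callable[[list[str]], Summary]) -> Summary:
    """Summarize only the newly truncated span, then merge it in."""
    return merge_summaries(existing, summarize_span(truncated_span))


def fake_summarizer(span: list[str]) -> Summary:
    # Stand-in for a model call; returns a fixed structured summary.
    return {"Files Modified": ["config/redis.ts: Updated connection pooling"],
            "Current State": ["14 tests passing, 2 failing"]}


summary: Summary = {
    "Files Modified": ["auth.controller.ts: Fixed JWT token generation"],
    "Current State": ["10 tests passing, 6 failing"],
}
summary = compress(summary, ["<truncated messages>"], fake_summarizer)
```

Because every section is visited on each merge, an empty section shows up as an explicit gap instead of disappearing silently, which is the checklist effect described above.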
2. Opaque Compression
Compressed representations optimized for reconstruction fidelity. Achieves 99%+ compression but sacrifices interpretability.
3. Regenerative Full Summary
Generate a new detailed summary on each compression. The result is readable but may lose details across repeated compression cycles.
The Artifact Trail Problem
Artifact trail integrity is weak across all compression methods (2.2-2.5 out of 5.0). Coding agents need to know:
- Which files were created/modified
- What changed in each file
- Function names, variable names, error messages
Solution: Separate artifact index or explicit file-state tracking in agent scaffolding.
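One way to keep the artifact trail out of the lossy summarization path is a small index owned by the scaffolding. The sketch below is hypothetical: `ArtifactIndex`, `FileRecord`, and the `generateToken` identifier are invented for illustration, and the rendered format is only one possible choice.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """What the agent did to one file (kept outside the compressed context)."""
    path: str
    status: str                                          # "created" or "modified"
    changes: list[str] = field(default_factory=list)
    identifiers: set[str] = field(default_factory=set)   # functions, variables


class ArtifactIndex:
    """Hypothetical file-state tracker maintained by the agent scaffolding."""

    def __init__(self) -> None:
        self._files: dict[str, FileRecord] = {}

    def record(self, path: str, change: str, *, created: bool = False,
               identifiers: tuple[str, ...] = ()) -> None:
        """Log one change; the first call for a path fixes its status."""
        rec = self._files.get(path)
        if rec is None:
            rec = FileRecord(path, "created" if created else "modified")
            self._files[path] = rec
        rec.changes.append(change)
        rec.identifiers.update(identifiers)

    def render(self) -> str:
        """Text block to re-inject into context after each compression."""
        lines = ["## Files Modified"]
        for rec in self._files.values():
            lines.append(f"- {rec.path} ({rec.status}): " + "; ".join(rec.changes))
        return "\n".join(lines)


index = ArtifactIndex()
index.record("auth.controller.ts", "Fixed JWT token generation",
             identifiers=("generateToken",))  # identifier name is made up
index.record("config/redis.ts", "Updated connection pooling")
print(index.render())
```

Since the index is updated from tool calls and never summarized, its contents do not degrade across compression cycles.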
Structured Summary Sections
## Session Intent
[What the user is trying to accomplish]
## Files Modified
- auth.controller.ts: Fixed JWT token generation
- config/redis.ts: Updated connection pooling
## Decisions Made
- Using Redis connection pool instead of per-request
- Retry logic with exponential backoff
## Current State
- 14 tests passing, 2 failing
## Next Steps
1. Fix remaining test failures
2. Run full test suite
Compression Triggers
| Strategy | Trigger Point | Trade-off |
|---|---|---|
| Fixed threshold | 70-80% utilization | Simple but may compress too early |
| Sliding window | Keep last N turns + summary | Predictable context size |
| Importance-based | Compress low-relevance first | Complex but preserves signal |
| Task-boundary | Compress at logical completions | Clean summaries |
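The first two strategies in the table are simple enough to sketch directly. The function names and the 0.75 default (the midpoint of the 70-80% range) are assumptions for illustration.

```python
def should_compress(used_tokens: int, context_limit: int,
                    threshold: float = 0.75) -> bool:
    """Fixed-threshold trigger: fire once utilization reaches the threshold."""
    return used_tokens / context_limit >= threshold


def sliding_window(turns: list[str],
                   keep_last: int) -> tuple[list[str], list[str]]:
    """Split history into (turns to summarize, recent turns kept verbatim)."""
    if keep_last <= 0:
        return list(turns), []
    return turns[:-keep_last], turns[-keep_last:]


history = ["t1", "t2", "t3", "t4", "t5"]
if should_compress(used_tokens=150_000, context_limit=200_000):
    to_summarize, kept = sliding_window(history, keep_last=2)
    print(to_summarize, kept)
```

Importance-based and task-boundary triggers need a relevance score or a task-completion signal from the agent, so they are not sketched here.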
Compression Performance
| Method | Compression Ratio (tokens removed) | Quality Score |
|---|---|---|
| Anchored Iterative | 98.6% | 3.70 |
| Regenerative | 98.7% | 3.44 |
| Opaque | 99.3% | 3.35 |
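Compared with opaque compression, anchored iterative summarization retains an additional 0.7 percentage points of the original tokens (98.6% vs 99.3% compression) and gains 0.35 quality points (3.70 vs 3.35). That trade is worth it when re-fetching costs matter.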
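To put the ratios in absolute terms, the sketch below applies the table's figures to a hypothetical session of one million tokens. The session size is an assumption; the ratios and scores are copied from the table.

```python
# (compression ratio in percent, quality score), copied from the table above.
METHODS = {
    "Anchored Iterative": (98.6, 3.70),
    "Regenerative": (98.7, 3.44),
    "Opaque": (99.3, 3.35),
}


def retained_tokens(original_tokens: int, compression_pct: float) -> int:
    """Tokens left after compression, rounded to the nearest token."""
    return round(original_tokens * (100 - compression_pct) / 100)


session_tokens = 1_000_000  # hypothetical session size
for name, (pct, quality) in METHODS.items():
    kept = retained_tokens(session_tokens, pct)
    print(f"{name}: keeps {kept:,} tokens, quality {quality:.2f}")
```

At this session size, anchored iterative summarization keeps 14,000 tokens where opaque compression keeps 7,000, so the quality gain costs 7,000 extra tokens of context.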
Guidelines
- Optimize for tokens-per-task, not tokens-per-request
- Use structured summaries with explicit file tracking sections
- Trigger compression at 70-80% context utilization
- Implement incremental merging rather than full regeneration
- Track artifact trail separately if file tracking is critical
- Monitor re-fetching frequency as a compression quality signal
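Re-fetching frequency can be approximated by counting reads of files the agent had already read before a compression. The `RefetchMonitor` class below is a hypothetical sketch; it assumes the scaffolding can call `on_read` for each file read and `on_compression` each time the context is compressed.

```python
from collections import Counter


class RefetchMonitor:
    """Counts reads of files that were already read before a compression."""

    def __init__(self) -> None:
        self._seen_before_compression: set[str] = set()
        self._seen_since: set[str] = set()
        self.refetches = Counter()   # path -> number of re-fetches
        self.total_reads = 0

    def on_read(self, path: str) -> None:
        """Call for every file read the agent performs."""
        self.total_reads += 1
        # A re-fetch is the first read, after a compression, of a file the
        # agent had already seen before that compression.
        if path in self._seen_before_compression and path not in self._seen_since:
            self.refetches[path] += 1
        self._seen_since.add(path)

    def on_compression(self) -> None:
        """Call each time the context is compressed."""
        self._seen_before_compression |= self._seen_since
        self._seen_since = set()

    def refetch_rate(self) -> float:
        """Fraction of all reads that were re-fetches."""
        if self.total_reads == 0:
            return 0.0
        return sum(self.refetches.values()) / self.total_reads


monitor = RefetchMonitor()
monitor.on_read("auth.controller.ts")
monitor.on_read("config/redis.ts")
monitor.on_compression()
monitor.on_read("auth.controller.ts")  # counted as a re-fetch
monitor.on_read("auth.controller.ts")  # already re-read since compression
print(monitor.refetch_rate())
```

A rising rate after a change to the compression strategy suggests the summaries are dropping information the agent still needs.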
Created: 2025-12-22 | Version: 1.1.0
