token-management

Token budget awareness, compression strategies, and observation masking for optimal context usage

Updated 2/3/2026


Token Management & Compression Strategies

Comprehensive guide for managing token budget, applying tier-appropriate compression, and optimizing context window usage.

Overview

The context window is a public good: every token competes with other information for space.

This skill provides:

  • Token budget monitoring strategies
  • Tier-aware compression ratios (T1: 5-10x, T2: 2-5x, T3: 1-2x, T4: none)
  • Observation masking rules
  • Semantic preservation guidelines (LLMLingua-2)
  • Budget overflow protocols

Token Budget Awareness (T2+)

Built-in Monitoring

# Check current project stats
opencode stats --project ""

# Check last 7 days with model breakdown
opencode stats --days 7 --models 5

# Full breakdown
opencode stats --days 30 --models 10 --tools 10

Budget Action Triggers

Heuristic Triggers

| Event | Action |
|---|---|
| Phase boundary | Run opencode stats, summarize to task_plan.md |
| 3+ long tool outputs | Consider notes.md offload |
| Error investigation >2 attempts | Document state, check stats |
| Research accumulated | Transfer to notes.md |
| Before /clear | Run stats to log, then clear |
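
The trigger table can be read as a simple rule set. The sketch below is one hypothetical way to encode it; the function name `due_actions` and its parameters are illustrative assumptions, not part of opencode or this skill.

```python
# Hypothetical helper: maps the heuristic triggers to their actions.
# Parameter names are assumptions; thresholds come from the trigger table.
def due_actions(at_phase_boundary=False, long_outputs=0, error_attempts=0,
                research_accumulated=False, before_clear=False):
    actions = []
    if at_phase_boundary:
        actions.append("Run opencode stats, summarize to task_plan.md")
    if long_outputs >= 3:          # "3+ long tool outputs"
        actions.append("Consider notes.md offload")
    if error_attempts > 2:         # "Error investigation >2 attempts"
        actions.append("Document state, check stats")
    if research_accumulated:
        actions.append("Transfer to notes.md")
    if before_clear:
        actions.append("Run stats to log, then clear")
    return actions


if __name__ == "__main__":
    for action in due_actions(long_outputs=4, error_attempts=3):
        print(action)
```

Several triggers can fire at once, so the helper returns a list rather than a single action.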

Budget Overflow Protocol

When context fills up (75%+ usage):

  1. Run assessment:

    opencode stats --project ""
    
  2. Offload research findings:

    • Transfer to notes.md
    • Keep executive summary in context
  3. Summarize completed phases:

    • Update task_plan.md with phase summaries
    • Archive detailed exploration notes
  4. Store key learnings (load the sia-code/decision-trace skill for the structured format):

    uvx sia-code memory add-decision "[Category]: [Decision]. Context: [trigger]. Reasoning: [why]. Outcome: [result]."
    
  5. If still overloaded:

    • /clear and restore from task_plan.md
    • Re-establish context from plan + notes
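
As a rough illustration of the 75% trigger, the sketch below checks a usage fraction and returns the protocol steps in order. How the token counts are obtained is an assumption (the skill does not say where they come from), and `overflow_plan` is a hypothetical name.

```python
# Minimal sketch of the 75% overflow check. The token counts are assumed
# to be supplied by the caller; nothing here queries opencode.
OVERFLOW_STEPS = [
    "Run assessment (opencode stats)",
    "Offload research findings to notes.md",
    "Summarize completed phases in task_plan.md",
    "Store key learnings in sia-code memory",
    "If still overloaded: /clear and restore from task_plan.md",
]


def context_usage(used_tokens, window_tokens):
    """Return the fraction of the context window in use."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def overflow_plan(used_tokens, window_tokens, threshold=0.75):
    """Return the protocol steps if usage is at or above the threshold."""
    if context_usage(used_tokens, window_tokens) >= threshold:
        return list(OVERFLOW_STEPS)
    return []


if __name__ == "__main__":
    for step in overflow_plan(160_000, 200_000):
        print(step)
```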

Observation Masking (Tier-Aware)

Long outputs waste tokens. Apply tier-appropriate masking (50%+ cost savings):

Masking Rules by Tier

| Output Type | T1 (Simple) | T2 (Moderate) | T3+ (Complex) |
|---|---|---|---|
| File >100 lines | First 20 + last 10 + matches | Error context + 20 lines | Full structure |
| Command success | Exit code only | Exit code + key metrics | Exit code + full output |
| Command error | Full error + 3 lines | Full error + 5 lines | Full error section |
| Test results | Pass/fail counts | + first 3 failures | + all failures + stacks |
| API response | Schema only | Schema + sample | Full response |
| Build logs | Final 5 lines | Final 10 lines | Full error section |
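
To make the first row concrete, here is a minimal sketch of the T1 rule for long files (first 20 lines, last 10 lines, plus any matching lines). The function name `mask_file_t1` and the placeholder text for masked spans are assumptions made for illustration.

```python
import re


def mask_file_t1(lines, pattern=None, head=20, tail=10, min_lines=100):
    """Keep the first `head` lines, the last `tail` lines, and regex matches.

    Files of `min_lines` lines or fewer are returned unchanged, since the
    rule only applies to files longer than 100 lines.
    """
    if len(lines) <= min_lines:
        return list(lines)
    keep = set(range(head)) | set(range(len(lines) - tail, len(lines)))
    if pattern:
        rx = re.compile(pattern)
        keep |= {i for i, line in enumerate(lines) if rx.search(line)}
    out, prev = [], -1
    for i in sorted(keep):
        if i - prev > 1:
            # Replace each skipped span with a single placeholder line.
            out.append(f"... [{i - prev - 1} lines masked] ...")
        out.append(lines[i])
        prev = i
    return out


if __name__ == "__main__":
    sample = [f"line {i}" for i in range(150)]
    sample[75] = "ERROR: something failed"
    print("\n".join(mask_file_t1(sample, pattern="ERROR")))
```

The placeholder records how many lines were dropped, so the reader can tell masked output from a genuinely short file.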

Semantic Preservation

Always keep:

  • Function signatures
  • Error lines
  • Imports
  • Class definitions

Safe to compress:

  • Repeated patterns
  • Verbose comments
  • Whitespace

Never discard:

  • The exact line referenced in errors

Compression Strategy (Tier-Aware)


Semantic Preservation Rules (LLMLingua-2)

Core Principles

Always keep:

  • Function signatures
  • Imports
  • Class definitions
  • Error lines
  • Variable declarations (in scope)

Compress safely:

  • Repeated patterns
  • Verbose comments
  • Extensive whitespace
  • Boilerplate code

Never discard:

  • The exact line referenced in errors
  • Function/class definitions in error stack
  • Import statements causing issues
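
The rules above can be approximated with a line classifier. This is a heuristic sketch for Python source only; the regular expressions and the three labels (`keep`, `compressible`, `review`) are assumptions, and it does not implement LLMLingua-2 itself.

```python
import re

# Heuristic patterns (assumptions): signatures, class definitions, imports.
ALWAYS_KEEP = re.compile(r"^\s*(def |async def |class |import |from \S+ import )")
# Anything that looks like an error or traceback line is kept at full fidelity.
ERROR_LINE = re.compile(r"(Error|Exception|Traceback)")


def classify(line):
    """Label a line as 'keep', 'compressible', or 'review'."""
    if ERROR_LINE.search(line) or ALWAYS_KEEP.match(line):
        return "keep"
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return "compressible"   # whitespace and comments
    return "review"             # needs judgment (e.g. in-scope variables)


if __name__ == "__main__":
    for text in ["import os", "# verbose comment", "x = compute()"]:
        print(f"{classify(text):13} {text}")
```

Lines labelled `review` are deliberately not auto-compressed, because deciding whether a variable declaration is in scope needs more context than a single line gives.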

Example: T1 Compression

Original (100 lines):

# Long file with verbose comments
import os
import sys
import json

def process_data(data):
    """
    This function processes data by doing X, Y, and Z.
    It takes a data parameter and returns processed result.
    ... (50 lines of docstring) ...
    """
    # Implementation details...
    result = transform(data)
    return result

# ... 80 more lines ...

T1 Compressed:

import os, sys, json
def process_data(data):
    result = transform(data)
    return result
# ... [80 lines compressed] ...
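
One possible way to automate part of this compression for Python files is to drop docstrings and comments with the standard `ast` module (Python 3.9+ for `ast.unparse`). This is a sketch under that assumption; unlike the hand-compressed example, it does not merge imports onto one line or elide trailing code.

```python
import ast


def strip_docstrings(source):
    """Return `source` with docstrings removed; comments vanish on re-parse."""
    tree = ast.parse(source)
    holders = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(tree):
        if not isinstance(node, holders):
            continue
        body = node.body
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            # Keep the body non-empty so the result stays valid Python.
            node.body = body[1:] or [ast.Pass()]
    return ast.unparse(tree)


if __name__ == "__main__":
    original = (
        "import os\n\n"
        "def process_data(data):\n"
        '    """Verbose docstring."""\n'
        "    # Implementation details...\n"
        "    result = transform(data)\n"
        "    return result\n"
    )
    print(strip_docstrings(original))
```

Function signatures, imports, and statements survive unchanged, which matches the "always keep" list.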

Example: T3 Architecture Task

Original: keep in full (no compression)

Reasoning: Architecture decisions require understanding full context, including:

  • All class relationships
  • Method signatures
  • Inheritance hierarchies
  • Complex logic flow

Context Stability

Keep Fixed

  • AGENTS.md rules (this system prompt)
  • Current task goal from task_plan.md
  • Active phase objectives

Summarize at Boundaries

  • Completed phases (executive summary in task_plan.md)
  • Exploration findings (detailed in notes.md)
  • Research notes (transfer to notes.md or sia-code memory)

Archive Aggressively

  • Old tool outputs (>3 turns ago, unless actively referenced)
  • Completed explorations
  • Resolved error investigations
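
The "older than 3 turns unless actively referenced" rule could be sketched as follows. The `ToolOutput` record and its `referenced` flag are hypothetical; how a real agent tracks turns and references is not specified by this skill.

```python
from dataclasses import dataclass


@dataclass
class ToolOutput:
    turn: int                  # turn in which the output was produced
    text: str
    referenced: bool = False   # assumption: caller marks active references


def split_for_archive(outputs, current_turn, max_age=3):
    """Split outputs into (keep, archive) using the >3-turns rule."""
    keep, archive = [], []
    for output in outputs:
        too_old = current_turn - output.turn > max_age
        if too_old and not output.referenced:
            archive.append(output)
        else:
            keep.append(output)
    return keep, archive


if __name__ == "__main__":
    history = [ToolOutput(1, "ls"), ToolOutput(2, "schema", referenced=True)]
    kept, archived = split_for_archive(history, current_turn=8)
    print(len(kept), "kept,", len(archived), "archived")
```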

Recovery Strategies

Pre-Clear Protocol

Before running /clear:

  1. Document current state:

    • Update task_plan.md with exact position
    • Log next 2 steps clearly
    • Store context-critical insights in sia-code memory
  2. Run stats:

    opencode stats --project ""
    
  3. Archive findings:

    • Transfer research to notes.md
    • Store learnings in sia-code memory
  4. Mark checkpoint in plan:

    ## Checkpoint: Before Clear
    - Position: [exact step]
    - Next: [next 2 steps]
    - Critical context: [key info]
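
    A small helper can write this checkpoint consistently. The sketch below is illustrative only; the function names and the choice to join the next steps with "; " are assumptions.

```python
from pathlib import Path


def checkpoint_block(position, next_steps, critical):
    """Render the checkpoint in the format shown above."""
    return (
        "## Checkpoint: Before Clear\n"
        f"- Position: {position}\n"
        f"- Next: {'; '.join(next_steps)}\n"
        f"- Critical context: {critical}\n"
    )


def append_checkpoint(plan_path, position, next_steps, critical):
    """Append the checkpoint to the plan file, creating it if needed."""
    path = Path(plan_path)
    with path.open("a", encoding="utf-8") as handle:
        handle.write("\n" + checkpoint_block(position, next_steps, critical))


if __name__ == "__main__":
    print(checkpoint_block("Phase 2, step 4",
                           ["run tests", "update docs"],
                           "schema migration half applied"))
```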
    

Post-Clear Recovery

After /clear:

  1. Read task_plan.md:

    • Find "Checkpoint: Before Clear" or current position
    • Understand completed phases
  2. Restore TodoWrite:

    • Initialize with remaining steps
    • Mark prior phases as completed
  3. Resume from position:

    • Continue from exact step
    • Reference notes.md as needed
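
Step 1 of the recovery can be sketched as a small parser that finds the most recent "Checkpoint: Before Clear" block in the plan text. The function name `read_checkpoint` is hypothetical, and the parser assumes the checkpoint uses the "- Field: value" layout from the Pre-Clear Protocol.

```python
import re

MARKER = "## Checkpoint: Before Clear"


def read_checkpoint(plan_text):
    """Return the fields of the last checkpoint as a dict, or None."""
    start = plan_text.rfind(MARKER)
    if start == -1:
        return None
    fields = {}
    for line in plan_text[start + len(MARKER):].splitlines():
        line = line.strip()
        if line.startswith("## "):
            break  # next heading ends the checkpoint block
        match = re.match(r"-\s*([^:]+):\s*(.*)", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


if __name__ == "__main__":
    plan = (
        "# Plan\n\n"
        "## Checkpoint: Before Clear\n"
        "- Position: Phase 2, step 4\n"
        "- Next: run tests; update docs\n"
        "- Critical context: schema migration half applied\n"
    )
    print(read_checkpoint(plan))
```

Using the last occurrence means repeated clears in one session resume from the newest checkpoint.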

Best Practices

DO

  • ✅ Monitor at phase boundaries (opencode stats)
  • ✅ Offload research to notes.md early (not at 90%)
  • ✅ Store learnings in sia-code memory (not in context)
  • ✅ Match compression to tier (T1: heavy, T3: light)
  • ✅ Keep errors at full fidelity (always)

DON'T

  • ❌ Wait until forced to /clear (proactive offloading)
  • ❌ Apply same compression to all tiers (tier-aware)
  • ❌ Compress error outputs (always full)
  • ❌ Lose investigation progress (store first, then /clear)
  • ❌ Forget to log stats before /clear (tracking)


Quick Reference

When to Check Stats

  • ☐ Phase boundaries
  • ☐ After 3+ long tool outputs
  • ☐ Before /clear
  • ☐ Error investigation >2 attempts

Compression Ratios

  • T1: 5-10x (aggressive)
  • T2: 2-5x (moderate)
  • T3: 1-2x (light)
  • T4: None (full fidelity)
  • Errors: FULL (always)
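
A compression ratio here is original size divided by compressed size. The sketch below checks a result against the tier ranges listed above; treating T4 ("None") as a ratio of exactly 1.0 is an assumption, as are the function names.

```python
# Tier ranges from the list above; T4 "None" is modeled as exactly 1.0.
TIER_RATIOS = {
    "T1": (5.0, 10.0),
    "T2": (2.0, 5.0),
    "T3": (1.0, 2.0),
    "T4": (1.0, 1.0),
}


def compression_ratio(original_tokens, compressed_tokens):
    """Return original / compressed, e.g. 4.0 means a 4x reduction."""
    if compressed_tokens <= 0:
        raise ValueError("compressed_tokens must be positive")
    return original_tokens / compressed_tokens


def within_tier(tier, original_tokens, compressed_tokens):
    """Check whether the achieved ratio falls inside the tier's range."""
    low, high = TIER_RATIOS[tier]
    return low <= compression_ratio(original_tokens, compressed_tokens) <= high


if __name__ == "__main__":
    print(compression_ratio(1000, 250))   # 4x reduction
    print(within_tier("T2", 1000, 250))
```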

Offload Targets

  • Research findings → notes.md
  • Key learnings → sia-code memory
  • Completed phases → task_plan.md summary
  • Old tool outputs → Archive (remove from context)

Usage

Load this skill when:

  • Context feels bloated (offload guidance)
  • Approaching token limits (overflow protocol)
  • Uncertain about compression level (tier matching)
  • Before /clear (recovery protocols)
  • Setting up new task (budget awareness)

Install

Requires askill CLI v1.0+

AI Quality Score

90/100 (analyzed 2/9/2026)

An exceptionally well-structured and actionable guide for token management in agentic workflows. It provides specific CLI commands, tier-based compression logic, and clear protocols for context recovery.


Metadata

License: unknown
Version: 1.0.0
Updated: 2/3/2026
Publisher: DxTa

Tags

api, ci-cd, observability, prompting, testing