Debug Skill

Structured debugging with expert domain knowledge. Systematic, not random.

Usage

/debug <description or error>

Philosophy

SYSTEMATIC DEBUGGING

Never guess randomly
Reproduce before fixing
One hypothesis at a time
Prove the fix with a test

Workflow

┌─────────────────────────────────────┐
│ Step 1: Reproduce                   │
│ (Confirm error is reproducible)     │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ Step 2: Isolate                     │
│ (Smallest failing case)             │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ Step 3: Trace                       │
│ (Read traceback bottom-to-top)      │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ Step 4: Hypothesize                 │
│ (Form theory, test minimally)       │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ Step 5: Fix                         │
│ (Apply fix with regression test)    │
└─────────────────┬───────────────────┘
                  ↓
┌─────────────────────────────────────┐
│ Step 6: Prevent                     │
│ (Add guards, improve errors)        │
└─────────────────────────────────────┘

Step 1: Reproduce

CRITICAL: Do not proceed until you can reproduce the error.

Questions to Ask

1. What are the exact steps to reproduce?
2. What environment (Python version, OS, dependencies)?
3. What input data causes the error?
4. Is it consistent or intermittent?
5. When did it start happening?
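
Question 2 is easier to answer consistently if the environment is captured by a script rather than typed from memory. The following is a minimal sketch using only the standard library; the helper name describe_environment and the example package name are assumptions for illustration, not part of this skill.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=()):
    """Collect Python version, OS, and selected package versions as a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "packages": versions,
    }


# "pip" is only an example; list the packages your bug actually touches.
for key, value in describe_environment(["pip"]).items():
    print(f"{key}: {value}")
```

Pasting this output at the top of the reproduction script's docstring keeps the environment next to the steps that depend on it.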

Reproduction Script

# tmp/reproduce_bug.py
"""
Reproduction script for: [bug description]
Expected: [what should happen]
Actual: [what happens instead]
"""

# Minimal setup
from src.module import function_that_fails

# Exact input that causes error
input_data = {...}

# Trigger the bug
result = function_that_fails(input_data)

Verify Reproduction

python tmp/reproduce_bug.py
# Should see the error
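
For intermittent errors, a single run proves little. One possible approach is to call the failing code many times and record how often it raises. The sketch below assumes the bug surfaces as an exception; failure_rate and flaky are hypothetical names, and flaky only simulates an intermittent failure.

```python
import itertools


def failure_rate(trigger, attempts=20):
    """Call `trigger` repeatedly and return the fraction of calls that raise."""
    failures = 0
    for _ in range(attempts):
        try:
            trigger()
        except Exception:
            failures += 1
    return failures / attempts


# Stand-in for the real bug: raises on every fourth call.
_calls = itertools.count(1)


def flaky():
    if next(_calls) % 4 == 0:
        raise RuntimeError("simulated intermittent failure")


print(f"failed {failure_rate(flaky, attempts=20):.0%} of runs")
```

A rate measured before the fix gives a baseline: after the fix, the same number of attempts should produce zero failures.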

Step 2: Isolate

Find the smallest case that fails:

Bisection

Full input fails → Half input fails? → Quarter input fails?
Continue until minimal failing case found
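
The halving idea above can be sketched as a small helper for list-shaped input. This is an illustration under assumptions: shrink_failing_input and the fails predicate are hypothetical names, and the predicate must return True whenever the bug reproduces. The result is small but not guaranteed to be the true minimum, because the loop stops when the failure needs elements from both halves.

```python
def shrink_failing_input(items, fails):
    """Halve `items` repeatedly, keeping whichever half still fails."""
    if not fails(items):
        raise ValueError("the full input does not reproduce the failure")
    while len(items) > 1:
        mid = len(items) // 2
        left, right = items[:mid], items[mid:]
        if fails(left):
            items = left
        elif fails(right):
            items = right
        else:
            break  # the failure depends on elements from both halves
    return items


# Hypothetical bug: processing fails whenever a record lacks "id".
def has_bad_record(batch):
    return any("id" not in record for record in batch)


records = [{"id": 1}, {"id": 2}, {"name": "no id"}, {"id": 4}]
print(shrink_failing_input(records, has_bad_record))
```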

Isolation Questions

1. Which specific input field causes it?
2. Which line of code is the entry point?
3. Can we trigger it with a unit test?

Minimal Failing Test

def test_minimal_reproduction():
    """Minimal case that reproduces the bug."""
    # This should fail until bug is fixed
    result = function(minimal_input)
    assert result == expected

Step 3: Trace

Read the traceback systematically:

Bottom-to-Top Reading

Traceback (most recent call last):
  File "main.py", line 10, in <module>    ← Start of call chain
    process_data(data)
  File "processor.py", line 25, in process_data
    result = transform(data)
  File "transform.py", line 42, in transform
    return data["missing_key"]            ← ACTUAL ERROR LOCATION
KeyError: 'missing_key'                   ← ERROR TYPE AND MESSAGE

Key Questions

1. What is the exact exception type?
2. What is the failing line?
3. What are the variable values at that point?
4. What called this function?
5. What data was passed?
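
Questions 1 and 2 can also be answered programmatically when the traceback is long or buried in logs. The sketch below uses the standard traceback module to pull out the innermost frame, which is the last entry of the extracted stack. The helper name innermost_frame is an assumption, and the two small functions only mimic the call chain shown in the traceback above.

```python
import traceback


def innermost_frame(exc):
    """Return (filename, line number, function name, source line) of the raise site."""
    frame = traceback.extract_tb(exc.__traceback__)[-1]
    return frame.filename, frame.lineno, frame.name, frame.line


def transform(data):
    return data["missing_key"]


def process_data(data):
    return transform(data)


try:
    process_data({"other": 1})
except KeyError as exc:
    filename, lineno, func, source = innermost_frame(exc)
    print(f"{type(exc).__name__}: {exc} raised in {func}() at {filename}:{lineno}")
    print(f"    {source}")
```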

Add Debugging Output

# Temporary debugging (remove after fix)
import json
print(f"DEBUG: data = {json.dumps(data, indent=2, default=str)}")

Step 4: Hypothesize

Form a theory and test it minimally:

Hypothesis Template

IF [condition] THEN [error occurs] BECAUSE [mechanism]

Example:
IF data["key"] is missing THEN KeyError BECAUSE API changed response format

Test Hypothesis

# Quick test of hypothesis
if "key" in data:
    print("Key exists - hypothesis wrong")
else:
    print("Key missing - hypothesis confirmed")

Common Bug Patterns

Null/None:

# Check for None before access
if obj is None:
    ...  # handle the None case (return a default or raise with context)

Type Errors:

# Check types
print(f"Type of data: {type(data)}")
# Expected: dict, got: list

Async Issues:

# Race conditions, ordering issues
# Add logging with timestamps to see the actual order of events
import time
print(f"{time.time()}: Starting operation")

State Issues:

# Shared state modified unexpectedly
# Add state logging
print(f"State before: {state}")
# ... operation ...
print(f"State after: {state}")
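
Printing the whole state before and after gets noisy once the state is large. One alternative, sketched here under assumptions, is to take a deep copy first and report only the top-level keys that changed. The helper diff_state is a hypothetical name, and the mutation in the middle stands in for the operation under suspicion.

```python
import copy


def diff_state(before, after):
    """Return {key: (old, new)} for every top-level key that changed."""
    missing = object()  # sentinel so a key holding None is not mistaken for absent
    changed = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key, missing), after.get(key, missing)
        if old != new:
            changed[key] = (
                "<absent>" if old is missing else old,
                "<absent>" if new is missing else new,
            )
    return changed


state = {"retries": 0, "queue": ["a", "b"]}
snapshot = copy.deepcopy(state)  # deep copy so nested mutations stay visible

# Stand-in for the operation under suspicion.
state["queue"].append("c")
state["retries"] += 1

print(diff_state(snapshot, state))
```

A shallow copy would miss the append to the nested list, which is why the sketch uses copy.deepcopy.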

Step 5: Fix

Apply the fix with a test that proves it works:

TDD the Fix

# 1. Write test that fails
def test_handles_missing_key():
    data = {"other_key": "value"}  # missing expected key
    result = function(data)
    assert result == default_value  # or appropriate handling

# 2. Run test - should fail
# pytest tests/test_fix.py -v

# 3. Apply fix
def function(data):
    return data.get("key", default_value)

# 4. Run test - should pass
# pytest tests/test_fix.py -v

Fix Principles

- Minimal change to fix the issue
- Don't refactor while fixing
- Match existing code style
- Add error handling for the root cause instead of only patching symptoms

Step 6: Prevent

Make this bug impossible or obvious in the future:

Regression Test

def test_regression_issue_123():
    """Regression test for issue #123 - KeyError on missing field."""
    # This test ensures the bug doesn't return
    data = {"incomplete": "data"}
    result = function(data)
    assert result is not None

Improve Error Messages

# Instead of KeyError, provide context
if "required_field" not in data:
    raise ValueError(
        f"Missing 'required_field' in data. "
        f"Got keys: {list(data.keys())}"
    )
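
When the lookup already happens deep inside existing code, another option is to catch the low-level exception and re-raise it with context, using exception chaining so the original traceback is preserved. This is a sketch, and load_required is a hypothetical helper name.

```python
def load_required(data, field):
    """Look up `field`, converting a bare KeyError into a descriptive error."""
    try:
        return data[field]
    except KeyError as exc:
        # "from exc" keeps the original KeyError attached as __cause__.
        raise ValueError(
            f"Missing {field!r} in data. Got keys: {list(data)}"
        ) from exc


try:
    load_required({"other_key": "value"}, "required_field")
except ValueError as err:
    print(err)
    print("caused by:", repr(err.__cause__))
```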

Add Type Hints

from typing import Any

def function(data: dict[str, Any]) -> Result:  # Result: your project's return type
    """Process data with required fields."""
    ...

Add Validation

from pydantic import BaseModel

class InputData(BaseModel):
    required_field: str
    optional_field: str = "default"

def function(data: dict) -> Result:
    validated = InputData(**data)  # Fails fast with clear error
    ...

Expert Mode Integration

Domain-specific debugging patterns are auto-injected:

Voice/Telephony (call-screener)

- Audio buffer underruns
- WebSocket disconnections
- TTS timing issues
- Async event ordering

Diagnostics (Tesla/autodiag)

- Data freshness issues
- Alert correlation problems
- Rule logic errors
- ETL pipeline failures

Multi-LLM (llm-council)

- Response format mismatches
- Rate limiting
- Timeout handling
- Consensus algorithm bugs

Debugging Tools

Python Debugger

# Add breakpoint
import pdb; pdb.set_trace()

# Or in Python 3.7+
breakpoint()

Logging

import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

# Lazy %-formatting: the message is only built if DEBUG is enabled
logger.debug("Processing: %s", data)

Print Debugging (temporary)

# Use tmp/ folder for debug output
from pathlib import Path
Path("tmp").mkdir(exist_ok=True)  # write_text fails if the folder is missing
Path("tmp/debug.log").write_text(str(data))

Example

/debug "TypeError: 'NoneType' object is not subscriptable in process_user"

Step 1 - Reproduce:
→ Run process_user(user_id=123)
→ Error confirmed: TypeError on line 42

Step 2 - Isolate:
→ Error occurs when user has no profile
→ Minimal case: user with profile=None

Step 3 - Trace:
→ Line 42: return user.profile["name"]
→ user.profile is None
→ None["name"] causes TypeError

Step 4 - Hypothesize:
→ IF user.profile is None THEN error occurs
→ Test: check database - user 123 has profile=NULL

Step 5 - Fix:
→ Write test for user without profile
→ Fix: if user.profile: return user.profile["name"]
→ Test passes

Step 6 - Prevent:
→ Add regression test
→ Improve error: "User 123 has no profile"
→ Add type hint: Optional[Profile]
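
The end state of this walkthrough might look like the sketch below. It is an illustration under assumptions: the User dataclass, the dict-shaped profile, and the choice to raise ValueError are all hypothetical, since the real process_user and Profile types are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    profile: Optional[dict] = None  # assumed shape; stands in for Optional[Profile]


def process_user(user: User) -> str:
    """Return the profile name, failing with context when no profile exists."""
    if user.profile is None:
        raise ValueError(f"User {user.id} has no profile")
    return user.profile["name"]


def test_process_user_without_profile():
    """Regression test: profile=None must not surface as a TypeError."""
    try:
        process_user(User(id=123))
    except ValueError as err:
        assert str(err) == "User 123 has no profile"
    else:
        raise AssertionError("expected ValueError")


def test_process_user_with_profile():
    assert process_user(User(id=1, profile={"name": "Ada"})) == "Ada"
```

Raising with context is one design choice; returning a default name would also satisfy Step 5, but it would hide the missing profile from callers.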
