Debug Protocol Skill
Purpose
To replace "Shotgun Debugging" (randomly changing code) with a deterministic, scientific process. Goal: Fix the root cause, not just the symptom.
When to Use
- Trigger: When a test fails, an exception is thrown, or output is unexpected.
- Agent: Primarily used by `execution` mode.
- Support:
  - Call `quality` to write the reproduction test.
  - Call `researcher` if the error message is obscure/undocumented.
The Scientific Method (4-Phase Protocol)
Phase 1: Observation (The MRE)
Rule: If you cannot reproduce it, you cannot fix it.
- Isolate: Create a standalone script or test case that fails 100% of the time.
- Minimize: Remove all code not strictly necessary to produce the error.
- Log: Do not guess variables. Log them.
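As a rough illustration of the three steps above, here is a minimal sketch of a standalone reproduction script. The `parsePrice` function and its input are hypothetical stand-ins for whatever code is actually failing; the point is the shape (isolated, minimized, logging observed values), not the specific bug.

```typescript
// reproduce_issue.ts (sketch). `parsePrice` is a hypothetical stand-in
// for the code under investigation, not part of any real project.

// Minimized copy of the suspect logic: everything unrelated has been removed.
function parsePrice(raw: string): number {
  // parseFloat stops at the first comma, so "1,299.00" yields 1.
  return parseFloat(raw);
}

interface ReproResult {
  input: string;
  expected: number;
  actual: number;
  reproduced: boolean;
}

// Runs the smallest input known to trigger the failure and records
// the observed values instead of guessing them.
function runRepro(): ReproResult {
  const input = "1,299.00";
  const expected = 1299;
  const actual = parsePrice(input);
  return { input, expected, actual, reproduced: actual !== expected };
}

// Log the observed state as one structured record.
const reproResult = runRepro();
console.log(JSON.stringify(reproResult));
```

Because the script has no external dependencies and a fixed input, it should fail the same way on every run, which is the property Phase 1 asks for.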
Phase 2: Localization (The "Wolf Fence")
Rule: Divide and Conquer.
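- Binary Search: Check the data at a boundary between layers (Database, API, UI). If it is correct there, the fault lies downstream; if not, upstream. Halve the search space with each check.
- Trace: Follow the data flow. Find the exact line where reality diverges from expectation.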
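The bisection idea can be sketched in code as follows. This is only an illustration: the `Stage` and `Order` types, the stage names, and the injected fault are all hypothetical, and the search assumes that once the data goes bad it stays bad at every later checkpoint.

```typescript
// Wolf-fence sketch. Stage names and the `Order` shape are hypothetical.
interface Stage<T> {
  name: string;
  run: (value: T) => T;
}

// Returns the name of the first stage whose output fails `isValid`,
// "input" if the data was already bad, or null if every checkpoint passes.
// Assumes bad data stays bad downstream, which is what makes bisection valid.
function firstBadStage<T>(
  stages: Stage<T>[],
  input: T,
  isValid: (value: T) => boolean
): string | null {
  if (!isValid(input)) return "input";

  // Capture the value after every stage so each checkpoint can be inspected.
  const snapshots: T[] = [];
  let current = input;
  for (const stage of stages) {
    current = stage.run(current);
    snapshots.push(current);
  }

  // Binary search for the first invalid snapshot.
  let lo = 0;
  let hi = snapshots.length; // hi === length means "no bad stage found"
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isValid(snapshots[mid])) {
      lo = mid + 1; // still good here, so the fault is later
    } else {
      hi = mid; // already bad here, so the fault is here or earlier
    }
  }
  return lo < stages.length ? stages[lo].name : null;
}

interface Order {
  id: number;
  total: number | undefined;
}

const pipeline: Stage<Order>[] = [
  { name: "database", run: (o) => o },
  { name: "api", run: (o) => ({ ...o, total: undefined }) }, // injected fault
  { name: "ui", run: (o) => o },
];

const culprit = firstBadStage(
  pipeline,
  { id: 1, total: 100 },
  (o) => o.total !== undefined
);
console.log(culprit); // "api"
```

In a real system the snapshots would usually come from logs or breakpoints at each layer boundary rather than from re-running stages in memory.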
Phase 3: Hypothesis (The "5 Whys")
Rule: Do not fix the symptom.
- Identify: "Variable X is null."
- Ask Why: "Why is X null?" -> "Because the API returned 404."
- Ask Why: "Why 404?" -> "Because the ID was undefined."
- Ask Why: "Why undefined?" -> "Because of a typo in the frontend."
- Root Cause: Typo in the frontend payload.
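To make the difference between a symptom patch and a root-cause fix concrete, here is a small sketch that mirrors the chain above. Every name in it (`fetchUser`, `buildPayloadBuggy`, the `userID`/`userId` keys, the in-memory user table) is hypothetical and chosen only to reproduce the null, 404, undefined ID, typo sequence.

```typescript
// Hypothetical names throughout; this mirrors the example chain above.
type Payload = Partial<Record<string, string>>;

interface User {
  name: string;
}

interface ApiResponse {
  status: number;
  user: User | null;
}

const userTable: Partial<Record<string, User>> = { "42": { name: "Ada" } };

// The API reads `userId`. A missing ID results in a 404 and a null user.
function fetchUser(payload: Payload): ApiResponse {
  const id = payload["userId"];
  const user = id === undefined ? undefined : userTable[id];
  return user ? { status: 200, user } : { status: 404, user: null };
}

// Root cause: the frontend misspells the key, so `userId` is undefined.
function buildPayloadBuggy(id: string): Payload {
  return { userID: id };
}

// Symptom patch: guards against the null, but the request still fails with 404.
function displayNamePatched(id: string): string {
  const response = fetchUser(buildPayloadBuggy(id));
  return response.user ? response.user.name : "";
}

// Root-cause fix: correct the key at the point where the chain of whys ends.
function buildPayloadFixed(id: string): Payload {
  return { userId: id };
}

function displayNameFixed(id: string): string {
  const response = fetchUser(buildPayloadFixed(id));
  return response.user ? response.user.name : "";
}

console.log(JSON.stringify(displayNamePatched("42"))); // ""
console.log(JSON.stringify(displayNameFixed("42"))); // "Ada"
```

The patched version no longer crashes, yet it silently shows an empty name; only the fix at the end of the chain restores correct behaviour.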
Phase 4: Resolution & Verification
Rule: Prove it.
- Apply Fix: Correct the root cause.
- Verify: Run the MRE from Phase 1. It must pass.
- Regression: Run the full test suite. Ensure nothing else broke.
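One possible shape for this verification step is sketched below: the reproduction case runs first, followed by checks for behaviour that worked before the fix. The `parseAmount` function, the `Check` type, and the individual cases are hypothetical; in a real project the same roles would normally be played by the existing test runner and suite.

```typescript
// Hypothetical sketch: `parseAmount` stands in for the code that was fixed.
function parseAmount(raw: string): number {
  // Fix applied at the root cause: strip thousands separators before parsing.
  return parseFloat(raw.replace(/,/g, ""));
}

interface Check {
  name: string;
  run: () => boolean;
}

// The reproduction case from Phase 1, now expected to pass.
const reproCheck: Check = {
  name: "repro: thousands separator",
  run: () => parseAmount("1,299.00") === 1299,
};

// Stand-in for the wider suite: behaviour that worked before the fix.
const regressionChecks: Check[] = [
  { name: "plain integer", run: () => parseAmount("42") === 42 },
  { name: "decimal", run: () => parseAmount("0.5") === 0.5 },
  { name: "invalid input stays NaN", run: () => Number.isNaN(parseAmount("abc")) },
];

// Runs the reproduction first, then everything else, and lists failures.
function verifyFix(repro: Check, suite: Check[]): string[] {
  const failures: string[] = [];
  for (const check of [repro, ...suite]) {
    if (!check.run()) failures.push(check.name);
  }
  return failures;
}

const fixFailures = verifyFix(reproCheck, regressionChecks);
console.log(fixFailures.length === 0 ? "PASS" : `FAIL: ${fixFailures.join(", ")}`);
```

Keeping the reproduction case in the suite afterwards means the same bug cannot return unnoticed.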
Debugging Scratchpad Template
Copy this into context during a debug session.
## Debug Session: [Error Name]
### 1. The Reproduction (MRE)
- [ ] Created `reproduce_issue.ts`
- [ ] Failure is deterministic (Happens every time)
- **Error Message**: `[Paste exact error]`
### 2. Localization (Wolf Fence)
- [ ] Inputs are correct? (Yes/No)
- [ ] Intermediate state correct? (Yes/No)
- **Found at**: File `X`, Line `Y`.
### 3. Root Cause Analysis (5 Whys)
1. Why? [Answer]
2. Why? [Answer]
3. **Root Cause**: [The fundamental flaw]
### 4. Resolution
- **Action**: [What did you change?]
- **Verification**: `npm test -- reproduce_issue.ts` -> ✅ PASS
Anti-Patterns (Must Avoid)
- ❌ "It should work": Code does not care what it "should" do. It does what it is told. Look at what it is doing.
- ❌ Console Log Spray: Don't leave `console.log('here')` all over the code. Use structured logging or remove them after finding the bug.
- ❌ The "Blind Fix": Applying a StackOverflow solution without understanding why it applies to your specific context. (Call `researcher` first).
- ❌ Ignoring Warning Logs: Often the error is preceded by a warning 10 lines up. Read the whole log.
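As an alternative to scattered `console.log('here')` calls, a structured debug logger could look roughly like this. The `createDebugLogger` helper and the event name are hypothetical and not tied to any particular logging library; the sketch only shows the idea of named events, attached fields, and a single switch to turn diagnostics off.

```typescript
// Hypothetical helper; not tied to any particular logging library.
type LogFields = Record<string, unknown>;
type DebugLogger = (event: string, fields?: LogFields) => void;

// Returns a logger that emits one JSON line per event, or a no-op when
// disabled, so diagnostics can be switched off in one place.
function createDebugLogger(
  enabled: boolean,
  sink: (line: string) => void = console.log
): DebugLogger {
  if (!enabled) return () => {};
  return (event, fields = {}) => {
    sink(JSON.stringify({ ts: new Date().toISOString(), event, ...fields }));
  };
}

// Usage: a named event with the values that matter, instead of "here".
const debug = createDebugLogger(true);
debug("order.total.computed", { orderId: 7, total: 1299 });
```

Each line is machine-searchable by event name, and passing `false` to the factory silences all of them once the bug is found.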
Integration with Agents
- `quality` agent: If you are stuck on Phase 1 (Reproduction), delegate to `quality`. "I see the error, but I can't write a test for it. Quality Agent, please create a Cypress test for this UI bug."
- `researcher` agent: If you are stuck on Phase 3 (Hypothesis), delegate to `researcher`. "I have the error 'Heap Out of Memory'. Researcher, what are the common causes for this in Node.js 18?"
