Phased Review
Multi-stage implementation review pipeline with parallel sub-agent reviews, severity-based fix autonomy, and gated test verification.
Architecture
Stage 0 — Baseline Verification + Test Coverage Snapshot
Stage 1 — Parallel Code + Architecture Review (2 sub-agents)
Stage 2 — Synthesize Code/Architecture Findings
Stage 3 — Fix Code/Architecture Findings + Re-run Tests
Stage 4 — Simplicity Review (1 sub-agent)
Stage 5 — Fix Simplicity Findings + Re-run Tests
Stage 6 — Documentation Review (1 sub-agent)
Stage 7 — Fix Documentation Findings + Re-run Tests
Stage 8 — Parallel Security Review (3 sub-agents) ← BLOCKED until 3, 5, 7 pass
Stage 9 — Synthesize Security Findings
Stage 10 — Fix Security Findings + Re-run Tests
Stage 11 — Final Verification + Completion Validation
Design Principles
- Tests gate every fix stage. Tests run at Stage 0, then re-run after Stages 3, 5, 7, 10, and 11. No stage proceeds if tests fail.
- Security is always last. Stage 8 is blocked until Stages 3, 5, and 7 complete with passing tests.
- Parallel within stages, sequential between stages. Stage 1 dispatches its two sub-agents in parallel; Stage 8 dispatches its three sub-agents in parallel. Stage 8 itself is strictly blocked until Stages 3, 5, and 7 complete with passing tests.
- Autonomous fixes with escalation. Fix stages apply Critical, High, and Medium findings. Escalate to user only if a fix would change design intent or core functionality.
- Language agnostic. Baseline detection probes for test runners across ecosystems.
- No git operations. No branches, commits, or PRs. The calling agent or user handles git workflow.
- Centralized review log. All findings written to a single file for traceability.
Step 1: Scope Mode Selection
Parse the user's request to determine scope mode. If ambiguous, ask.
| Mode | Stages Run | Use Case |
|---|---|---|
full | 0, 1-11 | Complete pre-release validation |
code-only | 0, 1-3, 11 | Code quality pass — security explicitly out of scope |
security | 0, 8-10, 11 | Security-focused review only |
simplicity | 0, 4-5, 11 | YAGNI / over-engineering check |
docs | 0, 6-7, 11 | Documentation completeness review |
Default: full
Store the selected mode — it determines which stages run and what the completion checklist validates.
If mode is code-only, inform the user: "Note: code-only mode does not include security review. Run with security or full mode for security coverage."
Step 2: Review Log Setup
Create a review log file in .claude/memory/reviews/. This directory is gitignored (under .claude/memory/) so review logs are never committed to the repo.
Create the directory if it doesn't exist: mkdir -p .claude/memory/reviews
Filename: phased-review-YYYY-MM-DD.md (use current date)
If a log with today's date already exists, append a counter: phased-review-YYYY-MM-DD-2.md
Write the log header:
# Phased Review Log
- **Date:** [YYYY-MM-DD]
- **Mode:** [selected mode]
- **Stages:** [list of stage numbers for this mode]
- **Project:** [project name / working directory]
Step 3: Execute Stages
Run only the stages included in the selected mode, in order. Each stage writes results to the review log.
Stage 0 — Baseline Verification
Always runs in every mode.
Follow the detection and execution probes in references/baseline-detection.md to:
- Detect and run the test command. Record total, passing, failing, skipped, and exit code.
- Detect and run linting/typechecking (if available). Record pass/fail per tool. Skip tools that aren't present.
- Detect and run coverage measurement (if available). Record percentage. If no coverage tool found, note "not measured."
- Check CLAUDE.md or project config for custom test/lint commands — prefer those if specified.
Write to review log:
## Stage 0 — Baseline
- Tests: [X] passing, [Y] failing, [Z] skipped (exit code [N])
- Lint: [tool]: [PASS/FAIL] (or SKIPPED if not available)
- Typecheck: [tool]: [PASS/FAIL] (or SKIPPED if not available)
- Coverage: [X%] (tool: [name]) or "not measured"
- Test command: `[detected command]`
- Status: **[PASS/FAIL]**
GATE: If tests fail (exit code != 0), STOP. Report failures and tell the user they must fix baseline failures before review can begin. Do not proceed to any further stage.
Stage 1 — Parallel Code + Architecture Review
Modes: full, code-only
Launch two sub-agents in parallel using the Task tool. Use the prompts from references/sub-agent-prompts.md.
Sub-Agent A — Code Review:
subagent_type: "workflow-toolkit:code-reviewer"
Sub-Agent B — Architecture Review:
subagent_type: "compound-engineering:review:architecture-strategist"
Both agents must output findings categorized as Critical / High / Medium / Low with:
- Finding ID (C1, C2... for code; A1, A2... for architecture)
- File path and line reference
- Description
- Recommended fix
Write raw outputs to review log under ## Stage 1 — Raw Findings (Code) and ## Stage 1 — Raw Findings (Architecture).
Stage 2 — Synthesize Code/Architecture Findings
Modes: full, code-only
- Read both sub-agent outputs from Stage 1
- Deduplicate findings referencing the same code location or issue
- Assign consolidated IDs: CA-001, CA-002, etc.
- Group by severity: Critical > High > Medium > Low
Write to review log:
## Stage 2 — Code/Architecture Findings (Consolidated)
### Critical
- CA-001: [description] — [file:line]
### High
- CA-002: [description] — [file:line]
### Medium
- CA-003: [description] — [file:line]
### Low (informational — do not fix)
- CA-004: [description] — [file:line]
**Totals:** [X] Critical, [Y] High, [Z] Medium, [W] Low
Stage 3 — Fix Code/Architecture Findings + Re-run Tests
Modes: full, code-only
- Fix all Critical findings
- Fix all High findings
- Fix all Medium findings
- Escalate to user any fix that would change design intent or core functionality — do not apply autonomously
- Re-run the test command detected in Stage 0
- If tests fail after fixes, debug and resolve before proceeding
Write to review log:
## Stage 3 — Code/Architecture Fixes
- **Fixed:** [list of CA-IDs with one-line fix descriptions]
- **Escalated:** [list of CA-IDs with reason for escalation]
- **Deferred:** [Low findings — not fixed, informational only]
- Tests after fix: [X] passing, [Y] failing
- Status: **[PASS/FAIL]**
GATE: Tests must pass before proceeding.
Stage 4 — Simplicity Review
Modes: full, simplicity
Launch one sub-agent using the prompt from references/sub-agent-prompts.md.
subagent_type: "compound-engineering:review:code-simplicity-reviewer"
The agent categorizes findings as:
- Should Apply — clear simplification, no functional loss
- Consider — judgment call, context-dependent
- Skip — informational only
Write to review log under ## Stage 4 — Simplicity Findings.
Stage 5 — Fix Simplicity Findings + Re-run Tests
Modes: full, simplicity
- Apply all Should Apply findings autonomously
- For Consider findings, apply only if clearly beneficial; otherwise note as deferred
- Re-run tests
Write to review log:
## Stage 5 — Simplicity Fixes
- **Applied:** [list with descriptions]
- **Deferred:** [list with reasons]
- Tests after fix: [X] passing, [Y] failing
- Status: **[PASS/FAIL]**
GATE: Tests must pass before proceeding.
Stage 6 — Documentation Review
Modes: full, docs
Launch one sub-agent using the prompt from references/sub-agent-prompts.md.
subagent_type: "workflow-toolkit:ops-docs-generator"
Output: list of missing or outdated documentation with specific recommendations and draft content where possible.
Write to review log under ## Stage 6 — Documentation Findings.
Stage 7 — Fix Documentation Findings + Re-run Tests
Modes: full, docs
- Apply documentation fixes (update README, add missing docs, fix outdated content)
- Re-run tests to ensure no accidental code modifications
- If tests fail, something was accidentally changed — investigate and fix
Write to review log:
## Stage 7 — Documentation Fixes
- **Updated:** [list of files modified]
- **Created:** [list of new files, if any]
- Tests after fix: [X] passing, [Y] failing
- Status: **[PASS/FAIL]**
GATE: Tests must pass before proceeding.
Stage 8 — Parallel Security Review (3 Personas)
Modes: full, security
BLOCKING CONDITION: In full mode, this stage MUST NOT begin until Stages 3, 5, and 7 are ALL complete with passing tests. Verify all three gates passed before proceeding. In security mode (which skips 1-7), proceed directly after Stage 0 passes.
Launch three sub-agents in parallel using the Task tool. Use the prompts from references/sub-agent-prompts.md.
Sub-Agent E — Offensive Security (Red Team):
subagent_type: "compound-engineering:review:security-sentinel"
Thinks like an attacker. Finds exploitation paths, proves attack vectors with concrete PoC inputs, identifies the highest-impact vulnerabilities. IDs: OT1, OT2, etc.
Sub-Agent F — Defensive Security (Technical/Code):
subagent_type: "security-scanning:security-auditor"
Defense-in-depth mindset. Secure coding patterns, input validation at every layer, encryption implementation, security headers, DevSecOps integration. IDs: DF1, DF2, etc.
Sub-Agent G — Security Architecture / Auditor:
subagent_type: "security-scanning:threat-modeling-expert"
STRIDE analysis, attack tree construction, data flow diagrams, trust boundary mapping, risk scoring, and residual risk documentation. IDs: SA1, SA2, etc.
All three agents output findings as Critical / High / Medium / Low with finding IDs, file references, descriptions, and recommended fixes.
Write raw outputs to review log under:
## Stage 8 — Raw Findings (Offensive)## Stage 8 — Raw Findings (Defensive/Technical)## Stage 8 — Raw Findings (Security Architecture)
Stage 9 — Synthesize Security Findings
Modes: full, security
- Read all three sub-agent outputs from Stage 8 (offensive, defensive, architecture)
- Deduplicate — findings from different personas that identify the same underlying issue get merged (note which personas flagged it)
- Assign consolidated IDs: SEC-001, SEC-002, etc.
- Classify each finding:
- Actionable — can and should be fixed in this review cycle
- Informational — noted for awareness or future work
- Deferred — requires design changes beyond this review's scope
Write to review log:
## Stage 9 — Security Findings (Consolidated)
### Critical
- SEC-001 (actionable): [description] — [file:line]
### High
- SEC-002 (actionable): [description] — [file:line]
- SEC-003 (informational): [description — reason for deferral]
### Medium
- SEC-004 (deferred): [description — requires design change]
### Low (informational — do not fix)
- SEC-005: [description]
**Totals:** [X] Critical, [Y] High, [Z] Medium, [W] Low
**Actionable:** [N], **Informational:** [N], **Deferred:** [N]
Stage 10 — Fix Security Findings + Re-run Tests
Modes: full, security
- Fix all actionable Critical, High, and Medium security findings
- Add security-specific tests where applicable (input validation, injection tests, boundary checks)
- Escalate to user if a fix would change design intent
- Re-run tests including any newly added security tests
Write to review log:
## Stage 10 — Security Fixes
- **Fixed:** [list of SEC-IDs with descriptions]
- **New tests added:** [count and brief descriptions]
- **Informational/Deferred:** [list with justifications]
- Tests after fix: [X] passing, [Y] failing
- Status: **[PASS/FAIL]**
GATE: Tests must pass.
Stage 11 — Final Verification + Completion Validation
Always runs in every mode.
11.1 — Re-run Full Test Suite
Execute the test command from Stage 0. Record final counts.
11.2 — Re-run Coverage (if measured at baseline)
Execute the coverage command from Stage 0. Record final percentage.
11.3 — Re-run Lint/Typecheck (if available at baseline)
Execute lint/typecheck commands from Stage 0. Record final status.
11.4 — Completion Validation Checklist
For the selected scope mode, verify every required stage was executed and recorded in the review log. Read the review log file and check:
full mode — ALL required:
- Stage 0: Baseline recorded with test counts and coverage
- Stage 1: Two sub-agent reviews launched (code + architecture)
- Stage 2: Consolidated findings written with severity counts
- Stage 3: Fixes applied, tests re-run and passing
- Stage 4: Simplicity review completed
- Stage 5: Simplicity fixes applied, tests re-run and passing
- Stage 6: Documentation review completed
- Stage 7: Documentation fixes applied, tests re-run and passing
- Stage 8: Three security sub-agents launched (offensive, defensive, architecture)
- Stage 9: Security findings consolidated with severity counts
- Stage 10: Security fixes applied, tests re-run and passing
- Stage 11: Final verification passing
code-only mode: Stages 0, 1, 2, 3, 11
security mode: Stages 0, 8, 9, 10, 11
simplicity mode: Stages 0, 4, 5, 11
docs mode: Stages 0, 6, 7, 11
11.5 — Success Criteria
ALL must be true for the review to PASS:
- All tests passing (0 failures)
- No unfixed Critical findings across any stage
- No unfixed High findings (unless explicitly escalated to and accepted by the user)
- All Medium findings either fixed or documented as deferred with justification
- Lint/typecheck passing (if they passed at baseline)
- Coverage not decreased from baseline (if measured at baseline)
- Every stage required by the selected mode has an entry in the review log
11.6 — Write Final Summary
Write to review log:
## Stage 11 — Final Verification
### Test Results
- Baseline: [X] passing, [Y] failing → Final: [X] passing, [Y] failing
- New tests added during review: [N]
### Coverage
- Baseline: [X%] → Final: [Y%] (delta: [+/-Z%])
### Lint / Typecheck
- Baseline: [status] → Final: [status]
### Findings Summary
| Stage | Critical | High | Medium | Low | Fixed | Escalated | Deferred |
|-------|----------|------|--------|-----|-------|-----------|----------|
| Code/Arch (1-3) | ... | ... | ... | ... | ... | ... | ... |
| Simplicity (4-5) | - | - | - | - | ... | ... | ... |
| Documentation (6-7) | - | - | - | - | ... | ... | ... |
| Security (8-10) | ... | ... | ... | ... | ... | ... | ... |
| **Total** | ... | ... | ... | ... | ... | ... | ... |
### Completion Validation
- Mode: [mode]
- Stages required: [list]
- Stages completed: [list]
- Missing stages: [none / list]
### Final Status: **[PASS / FAIL]**
[If FAIL, list specific criteria not met]
11.7 — Present Summary to User
You MUST display the full Stage 11 summary directly in the conversation. Do not just write to the review log — the user needs to see the results without opening the file. Present:
- The complete findings summary table (all stages, all severities, fix/escalate/defer counts)
- Test results comparison (baseline → final)
- Coverage delta (if measured)
- Completion validation status (all stages checked off or missing stages listed)
- Final PASS/FAIL status with specific unmet criteria if FAIL
- The review log file path for full details:
.claude/memory/reviews/phased-review-[date].md
If the review PASSED, confirm clearly. If it FAILED, list every unmet criterion and what the user needs to address.
