PAW Review Workflow Skill
This workflow skill orchestrates the complete PAW Review process, coordinating activity skills through subagent execution to analyze pull requests and generate comprehensive review feedback.
Core Review Principles
These principles apply to ALL review stages. Activity skills reference these principles rather than duplicating them.
1. Evidence-Based Documentation
Every observation, finding, or claim MUST be supported by:
- Specific file:line references
- Concrete code patterns or test results
- Direct evidence from the codebase
NEVER include speculation, assumptions, or subjective preferences without evidence.
2. File:Line Reference Requirement
All code-related claims require specific file:line citations:
- [src/module.ts:45](src/module.ts#L45) for single lines
- [src/module.ts:45-52](src/module.ts#L45-L52) for ranges
- Multiple locations should be listed explicitly
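The citation format above can be produced mechanically. The following is a minimal sketch; the helper name `format_ref` is hypothetical and not part of the PAW tooling.

```python
from typing import Optional


def format_ref(path: str, start: int, end: Optional[int] = None) -> str:
    """Return a markdown file:line citation for a single line or a line range.

    Hypothetical helper: it only mirrors the link shapes shown in the
    principle above.
    """
    if start < 1 or (end is not None and end < start):
        raise ValueError("line numbers must be positive and end must be >= start")
    if end is None or end == start:
        # Single line: [path:45](path#L45)
        return f"[{path}:{start}]({path}#L{start})"
    # Range: [path:45-52](path#L45-L52)
    return f"[{path}:{start}-{end}]({path}#L{start}-L{end})"
```

Multiple locations would be emitted as separate calls, one citation per location.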
3. No Fabrication Guardrail
CRITICAL: Do not fabricate, invent, or assume information:
- If information is unavailable, state "Not found" or "Unable to determine"
- Do not hallucinate file contents, function behaviors, or patterns
- When uncertain, document the uncertainty explicitly
4. Document, Don't Critique (Early Stages)
Understanding and baseline research stages document what exists—they do NOT:
- Evaluate quality or suggest improvements
- Identify issues or bugs
- Make recommendations
- Critique implementation decisions
Evaluation and critique happen in designated later stages only.
5. Human Control Principle
The review workflow assists human reviewers—it does NOT replace their judgment:
- Pending reviews are NEVER auto-submitted
- Final decisions on all comments rest with the human reviewer
- Generated feedback is advisory, not prescriptive
- Humans can modify, skip, or override any recommendation
6. Artifact Completeness
Each stage produces complete, well-structured artifacts:
- No placeholders or "TBD" markers
- No unresolved questions blocking downstream stages
- Each artifact is self-contained and traceable to sources
Subagent Contract
Activity skills are executed via delegated agent sessions.
Skill Loading (CRITICAL)
Every subagent MUST load its skill FIRST, before executing any work:
{{#vscode}}
- Call paw_get_skill with the skill name (e.g., paw-review-understanding)
- Read and internalize the skill instructions
- Only then begin executing the activity
Delegation prompt must include: "First load your skill using paw_get_skill('paw-review-<skill-name>'), then execute the activity."
{{/vscode}}
{{#cli}}
- Load the skill by name (e.g., paw-review-understanding)
- Read and internalize the skill instructions
- Only then begin executing the activity
Delegation prompt must include: "First load the paw-review-<skill-name> skill, then execute the activity."
{{/cli}}
Response Format
Upon completion, respond with artifact path and status (Success, Partial, or Blocked).
Artifact Path Confirmation
Always confirm the exact path where artifacts were written. Downstream stages depend on this.
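The workflow does not prescribe a machine-readable response format. As an illustration only, a completion response could be modelled as below; `ActivityResult` and `Status` are hypothetical names, and only the three status values come from the contract above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # The three statuses named in the Response Format section.
    SUCCESS = "Success"
    PARTIAL = "Partial"
    BLOCKED = "Blocked"


@dataclass(frozen=True)
class ActivityResult:
    """Hypothetical record of what a subagent reports on completion."""

    artifact_path: str  # exact path where the artifact was written
    status: Status

    def summary(self) -> str:
        # One-line confirmation that a downstream stage could read.
        return f"Artifact: {self.artifact_path} | Status: {self.status.value}"
```

For example, `ActivityResult(".paw/reviews/PR-123/ReviewContext.md", Status.SUCCESS).summary()` gives a single line containing both the path and the status.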
Artifact Ownership (CRITICAL)
The orchestrating agent MUST NOT manually create artifacts that belong to activity skills. Each stage's artifacts must be produced by delegating to the designated skill. Manual population bypasses defaults, validation, and skill-specific logic (e.g., specialist selection defaults, ReviewContext field normalization).
Artifact Directory Structure
All review artifacts are stored in a consistent directory structure:
.paw/reviews/<identifier>/
├── ReviewContext.md # Stage: Understanding (initial)
├── ResearchQuestions.md # Stage: Understanding (initial)
├── CodeResearch.md # Stage: Baseline Research
├── DerivedSpec.md # Stage: Understanding (after research)
├── ImpactAnalysis.md # Stage: Evaluation (single-model mode)
├── GapAnalysis.md # Stage: Evaluation (single-model mode)
├── REVIEW-{SPECIALIST}.md # Stage: Evaluation (SoT mode, per specialist)
├── REVIEW-SYNTHESIS.md # Stage: Evaluation (SoT mode, synthesized findings)
├── CrossRepoAnalysis.md # Stage: Correlation (multi-repo only)
└── ReviewComments.md # Stage: Output (evolves: draft → assessed → finalized → posted)
ReviewComments.md Evolution:
- Draft: Initial comments generated by feedback skill
- Assessed: Assessment sections added by critic skill
- Finalized: **Final**: markers added by feedback skill (critique response)
- Posted: **Posted**: status added by github skill
Identifier Derivation
- Single GitHub PR: PR-<number> (e.g., PR-123)
- Multi-repo GitHub PRs: PR-<number>-<repo-slug> per PR (e.g., PR-123-my-api/, PR-456-my-frontend/)
- Local branch: Slugified branch name (e.g., feature-new-auth)
Repo-slug derivation: Last path segment of repository name, lowercase, special chars removed.
Example: acme-corp/my-api-service → my-api-service
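The derivation rules above could be sketched as follows. The function names are hypothetical. Two details are assumptions rather than stated rules: hyphens count as ordinary characters and are kept (consistent with the my-api-service example), and branch names are slugified by collapsing every non-alphanumeric run into a single hyphen.

```python
import re
from typing import Optional


def repo_slug(full_name: str) -> str:
    """Last path segment of the repository name, lowercased, special chars removed."""
    last = full_name.rstrip("/").split("/")[-1].lower()
    # Assumption: letters, digits, and hyphens are kept; everything else is dropped.
    return re.sub(r"[^a-z0-9-]", "", last)


def pr_identifier(number: int, full_name: Optional[str] = None) -> str:
    """PR-<number> for a single PR, PR-<number>-<repo-slug> in multi-repo reviews."""
    if full_name is None:
        return f"PR-{number}"
    return f"PR-{number}-{repo_slug(full_name)}"


def branch_identifier(branch: str) -> str:
    """Slugified branch name (assumed rule: non-alphanumeric runs become one hyphen)."""
    return re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
```

Under these assumptions, `pr_identifier(123, "acme-corp/my-api")` yields `PR-123-my-api`, matching the multi-repo example above.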
Multi-repo detection: Use when multiple workspace folders are open in VS Code OR multiple PRs are provided.
Workflow Orchestration
The workflow executes stages in sequence, with each stage producing artifacts consumed by downstream stages.
Understanding Stage
Skills: paw-review-understanding, paw-review-baseline
Sequence:
- Run paw-review-understanding activity
  - Input: PR number/URL or branch context, plus any review configuration parameters (e.g., Review Mode, Review Specialists) from the user's invocation
  - Output: ReviewContext.md, ResearchQuestions.md
- Run paw-review-baseline activity
  - Input: ReviewContext.md, ResearchQuestions.md
  - Output: CodeResearch.md
- Run paw-review-understanding activity (resume)
  - Input: ReviewContext.md, CodeResearch.md
  - Detects CodeResearch.md exists → skips to specification derivation
  - Output: DerivedSpec.md
Stage Gate: Verify ReviewContext.md, CodeResearch.md, DerivedSpec.md exist before proceeding.
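A stage gate of this kind reduces to an existence check on the artifact directory. The sketch below is illustrative: `missing_artifacts` and `UNDERSTANDING_GATE` are hypothetical names, and the same check would apply to the later stage gates with a different file list.

```python
from pathlib import Path
from typing import Iterable, List

# Files the Understanding stage gate requires, as listed above.
UNDERSTANDING_GATE = ("ReviewContext.md", "CodeResearch.md", "DerivedSpec.md")


def missing_artifacts(review_dir: str, required: Iterable[str]) -> List[str]:
    """Return the required artifact names that are not present as files."""
    base = Path(review_dir)
    return [name for name in required if not (base / name).is_file()]


def gate_passes(review_dir: str, required: Iterable[str] = UNDERSTANDING_GATE) -> bool:
    """True only when every required artifact exists."""
    return not missing_artifacts(review_dir, required)
```

Returning the list of missing names, rather than only a boolean, lets the orchestrator report exactly which artifact blocked the next stage.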
Evaluation Stage
Read Review Mode from ReviewContext.md to determine the evaluation path. The value must be single-model or society-of-thought, or may be absent. For any other value, report an error (Unknown Review Mode: <value>) and do not proceed.
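The mode check can be sketched as a small validation function. The name `resolve_review_mode` is hypothetical. Treating an absent value as single-model is an assumption, based on single-model being labelled the default mode.

```python
from typing import Optional

VALID_MODES = ("single-model", "society-of-thought")


def resolve_review_mode(value: Optional[str]) -> str:
    """Validate the Review Mode field read from ReviewContext.md."""
    if value is None or not value.strip():
        # Assumption: an absent or empty value falls back to the default mode.
        return "single-model"
    mode = value.strip()
    if mode not in VALID_MODES:
        # Error text follows the message format given in the section above.
        raise ValueError(f"Unknown Review Mode: {mode}")
    return mode
```

Raising instead of falling back keeps an unrecognized mode from silently running the wrong evaluation path.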
Single-Model Mode (default)
Skills: paw-review-impact, paw-review-gap
Sequence:
- Run paw-review-impact activity
  - Input: All understanding artifacts
  - Output: ImpactAnalysis.md
- Run paw-review-gap activity
  - Input: All understanding + impact artifacts
  - Output: GapAnalysis.md
Stage Gate: Verify ImpactAnalysis.md, GapAnalysis.md exist before proceeding.
Society-of-Thought Mode
Engine: paw-sot (loaded into orchestrator session, not as subagent)
When Review Mode is society-of-thought, load the paw-sot skill directly and invoke it with a review context constructed from ReviewContext.md fields and understanding artifacts:
| Review Context Field | Source |
|---|---|
| type | diff |
| coordinates | Diff: git diff <base-commit>...<head-commit>; Artifacts: ReviewContext.md, CodeResearch.md, DerivedSpec.md paths |
| output_dir | .paw/reviews/<identifier>/ |
| specialists | Review Specialists value from ReviewContext.md |
| interaction_mode | Review Interaction Mode value from ReviewContext.md |
| interactive | Review Interactive value from ReviewContext.md |
| specialist_models | Review Specialist Models value from ReviewContext.md |
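The field mapping in the table can be pictured as building a plain dictionary. This is a sketch only: the actual invocation format expected by paw-sot is not specified here, `build_sot_context` is a hypothetical name, and `fields` is assumed to be the already-parsed key/value content of ReviewContext.md.

```python
from typing import Any, Dict, Optional

UNDERSTANDING_ARTIFACTS = ("ReviewContext.md", "CodeResearch.md", "DerivedSpec.md")


def build_sot_context(
    identifier: str,
    base_commit: str,
    head_commit: str,
    fields: Dict[str, Optional[str]],
) -> Dict[str, Any]:
    """Assemble the review context fields listed in the table above."""
    output_dir = f".paw/reviews/{identifier}/"
    return {
        "type": "diff",
        "coordinates": {
            "diff": f"git diff {base_commit}...{head_commit}",
            "artifacts": [output_dir + name for name in UNDERSTANDING_ARTIFACTS],
        },
        "output_dir": output_dir,
        # Values are passed through unchanged; missing fields become None.
        "specialists": fields.get("Review Specialists"),
        "interaction_mode": fields.get("Review Interaction Mode"),
        "interactive": fields.get("Review Interactive"),
        "specialist_models": fields.get("Review Specialist Models"),
    }
```

Passing missing fields through as `None` leaves any defaulting to paw-sot itself, in line with the rule that the orchestrator should not reimplement skill-specific defaults.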
After paw-sot completes orchestration and synthesis, proceed to the Output stage.
Error handling: If paw-sot skill cannot be loaded, report error to user — do not fall back to single-model silently.
Stage Gate: Verify REVIEW-SYNTHESIS.md exists before proceeding.
Cross-Repository Correlation Stage (Multi-Repo Only)
Skill: paw-review-correlation
Condition: Only run when multiple PRs/repositories are detected. Skip for single-repo reviews.
Detection Criteria (any of):
- Multiple PR artifact directories exist (e.g., PR-123-repo-a/, PR-456-repo-b/)
- Multiple workspace folders open (detected via multiple .git directories)
- ReviewContext.md contains related_prs entries
Sequence:
- Run paw-review-correlation activity
  - Input: All per-repo evaluation artifacts (ImpactAnalysis.md + GapAnalysis.md in single-model mode, or REVIEW-SYNTHESIS.md in SoT mode)
  - Output: CrossRepoAnalysis.md (in primary repo's artifact directory)
Stage Gate: Verify CrossRepoAnalysis.md exists before proceeding to Output stage.
Skip Behavior: For single-repo reviews, proceed directly to Output stage without running correlation.
Output Stage
Skills: paw-review-feedback, paw-review-critic, paw-review-github
The Output stage uses an iterative feedback-critique pattern to refine comments before posting to GitHub.
Sequence:
- Run paw-review-feedback activity (Initial Pass)
  - Input: All prior artifacts: ReviewContext, CodeResearch, DerivedSpec, and evaluation artifacts (ImpactAnalysis + GapAnalysis in single-model mode, or REVIEW-SYNTHESIS in SoT mode), optionally CrossRepoAnalysis
  - Output: ReviewComments.md with draft comments (status: draft)
  - Does NOT post to GitHub in this pass
- Run paw-review-critic activity
  - Input: ReviewComments.md + all prior artifacts
  - Output: Assessment sections added to ReviewComments.md
  - Generates Iteration Summary with Include/Modify/Skip recommendations
- Run paw-review-feedback activity (Critique Response)
  - Input: ReviewComments.md (with assessments) + all prior artifacts
  - Detects Assessment sections → enters Critique Response Mode
  - Output: Updated comments with **Final**: markers (status: finalized)
  - Comments marked: "Ready for GitHub posting" or "Skipped per critique"
- Run paw-review-github activity (GitHub PRs only)
  - Input: ReviewComments.md with finalized comments
  - Output: Pending review created on GitHub, ReviewComments.md updated with post status
  - Only posts comments marked "Ready for GitHub posting"
  - Skipped comments remain in artifact but are NOT posted
  - Skipped for non-GitHub contexts (provides manual posting instructions instead)
Stage Gate: Verify all comments have **Final**: markers before GitHub posting.
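The gate and the posting filter can be sketched together. The in-memory `Comment` model below is hypothetical: the real ReviewComments.md is a markdown artifact whose parsing is not shown, and only the two final-status strings come from this workflow.

```python
from dataclasses import dataclass
from typing import List, Optional

READY = "Ready for GitHub posting"
SKIPPED = "Skipped per critique"


@dataclass
class Comment:
    """Hypothetical parsed form of one entry in ReviewComments.md."""

    location: str                # file:line reference the comment targets
    body: str
    final: Optional[str] = None  # None means no **Final**: marker yet


def comments_to_post(comments: List[Comment]) -> List[Comment]:
    """Enforce the stage gate, then keep only comments marked ready."""
    unfinalized = [c.location for c in comments if c.final is None]
    if unfinalized:
        # Gate: every comment needs a Final marker before anything is posted.
        raise ValueError(f"Comments missing Final marker: {unfinalized}")
    # Skipped comments stay in the artifact but are filtered out here.
    return [c for c in comments if c.final == READY]
```

The gate fails for the whole batch rather than posting the finalized subset, so a partially critiqued artifact never produces a partial pending review.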
Human Control Point: The pending review is created but NOT submitted. Human reviewer:
- Reviews generated comments in GitHub UI
- Can see full comment history in ReviewComments.md (original → assessment → updated)
- Can manually add skipped comments if they disagree with critique
- Modifies, adds, or removes comments as needed
- Submits review when satisfied
Terminal Behavior
Upon workflow completion, report:
- Artifact locations (all generated files in .paw/reviews/<identifier>/)
- GitHub PRs: Pending review ID and comment counts (e.g., "Pending review created: Review ID 12345678, 6 comments posted, 2 skipped per critique")
- Non-GitHub: Manual posting instructions location
- Multi-repo reviews: Cross-repo findings summary (interface contracts analyzed, mismatches found, deployment order)
- Comment evolution summary: original comments generated, modified per critique, skipped per critique
- Next steps for the human reviewer (review comments in GitHub UI, edit as needed, submit when ready)
Cross-Repository Support
If multiple repositories or PRs are detected:
- Identify which repositories have changes
- Determine the primary repository (where changes originate)
- For each repository, run the Understanding and Evaluation stages independently
- Run Cross-Repository Correlation stage to synthesize findings across repos
- In the Output stage, incorporate cross-repo findings into review comments
- Note cross-repo dependencies in comments using notation: (See also: owner/other-repo#NNN)
