Framework Audit Skill

Comprehensive governance audit for the academicOps framework.

NO RATIONALIZATION: An audit reports ALL discrepancies. Do NOT justify ignoring files as "generated", "acceptable", or "probably don't need to be tracked". Every gap is reported. The user decides what's acceptable - not the auditor.

Workflow Entry Point

IMMEDIATELY call TodoWrite with the following items, then work through each one:

TodoWrite(todos=[
  {content: "Phase 0: Run health metrics script", status: "pending", activeForm: "Running health audit"},
  {content: "Phase 1: Structure audit - compare filesystem to INDEX.md", status: "pending", activeForm: "Auditing structure"},
  {content: "Phase 2: Reference graph - invoke Skill(skill='framework') then run link audit scripts", status: "pending", activeForm: "Building reference graph"},
  {content: "Phase 3: Skill content audit - check size and actionability", status: "pending", activeForm: "Auditing skill content"},
  {content: "Phase 4: Justification audit - check specs for file references", status: "pending", activeForm: "Auditing file justifications"},
  {content: "Phase 4b: Instruction justification - verify every instruction traces to enforcement-map.md", status: "pending", activeForm: "Auditing instruction justifications"},
  {content: "Phase 5: Documentation accuracy - verify README.md flowchart vs hooks", status: "pending", activeForm: "Verifying documentation"},
  {content: "Phase 6: Regenerate indices - invoke Skill(skill='flowchart') for README.md flowchart", status: "pending", activeForm: "Regenerating indices"},
  {content: "Phase 7: Other updates", status: "pending", activeForm: "Finalizing updates"},
  {content: "Phase 8: Save audit report to $ACA_DATA/projects/aops/audit/YYYY-MM-DD-HHMMSS-audit.md", status: "pending", activeForm: "Persisting report"},
  {content: "Phase 9: Create tasks for actionable findings", status: "pending", activeForm: "Creating tasks"}
])

CRITICAL: Work through EACH phase in sequence. When a phase requires a skill, invoke it explicitly as shown below.

Specialized Workflows

Session Effectiveness Audit

Qualitative assessment of session transcripts to evaluate framework performance.

Skill(skill="audit", args="session-effectiveness /path/to/transcript.md")

Workflow defined in workflows/session-effectiveness.md.

Individual Scripts (Reference Only)

These scripts run individual checks. They are NOT a substitute for the full workflow:

uv run python scripts/audit_framework_health.py -m  # Phase 0 only
uv run python scripts/check_skill_line_count.py
uv run python scripts/check_broken_wikilinks.py
uv run python scripts/check_orphan_files.py

Phase Instructions

Phase 0: Health Metrics

Run comprehensive health audit first:

cd $AOPS && uv run python scripts/audit_framework_health.py \
  --output /tmp/health-$(date +%Y%m%d).json

This generates:

/tmp/health-YYYYMMDD.json - Machine-readable metrics
/tmp/health-YYYYMMDD.md - Human-readable report

Metrics tracked: Component counts, hook coverage, skill sizes, wikilink validation

→ Continue to Phase 1 (do not stop here)

Phase 1: Structure Audit

Compare filesystem to documentation:

Scan filesystem: find $AOPS -type f -not -path "*/.git/*" -not -path "*/__pycache__/*" | sort
Compare to INDEX.md: Flag missing or extra entries
Check cross-references: Verify → references point to existing files
Find broken wikilinks: Grep for [[...]] patterns, validate targets exist
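
The wikilink step above can be sketched in Python. This is a minimal illustration, not the logic of scripts/check_broken_wikilinks.py: the function names are hypothetical, and it assumes link targets are paths relative to the repository root with an optional .md suffix.

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]] and [[target#heading]]; captures only the target.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_wikilinks(text: str) -> list[str]:
    """Return the link targets found in a block of Markdown text."""
    return [m.group(1).strip() for m in WIKILINK_RE.finditer(text)]


def find_broken_wikilinks(root: Path) -> list[tuple[Path, str]]:
    """List (source file, target) pairs whose target does not exist under root.

    Assumption: targets resolve relative to root, with or without ".md".
    The real framework may resolve links differently.
    """
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        rel = md_file.relative_to(root)
        if ".git" in rel.parts:
            continue  # same exclusion as the find command above
        for target in extract_wikilinks(md_file.read_text(encoding="utf-8")):
            candidates = [root / target, root / f"{target}.md"]
            if not any(c.is_file() for c in candidates):
                broken.append((rel, target))
    return broken
```

Every pair returned is a finding to report; none should be filtered out by the auditor.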

Phase 2: Reference Graph & Link Audit

First: Invoke Skill(skill="framework") to load framework conventions for linking rules.

Then build reference graph and check linking conventions:

cd $AOPS

# Generate graph
uv run python skills/audit/scripts/build_reference_map.py

# Find orphans and violations
uv run python skills/audit/scripts/find_orphans.py

# Or use the standalone check scripts for wikilinks and orphans
uv run python scripts/check_broken_wikilinks.py
uv run python scripts/check_orphan_files.py

Linking rules to enforce (from framework skill):

Skills via invocation (Skill(skill="x")), not file paths
No backward links (children → parent)
Parents must reference children
Use wikilinks, not backticks for graph connectivity
Full relative paths in wikilinks

Phase 3: Skill Content Audit

For each $AOPS/skills/*/SKILL.md:

Size check: Must be <500 lines
Actionability test: Each section must tell agents WHAT TO DO
Content separation violations:
- ❌ Multi-paragraph "why" → move to spec
- ❌ Historical context → delete
- ❌ Reference material >20 lines → move to references/
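
The size check can be sketched as follows. This is an illustrative helper, not the contents of scripts/check_skill_line_count.py; the function name and the skills_dir parameter are assumptions.

```python
from pathlib import Path

MAX_SKILL_LINES = 500  # limit stated in the size check above


def oversized_skills(skills_dir: Path, limit: int = MAX_SKILL_LINES) -> list[tuple[str, int]]:
    """Return (skill name, line count) for every SKILL.md at or over the limit."""
    results = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        line_count = len(skill_file.read_text(encoding="utf-8").splitlines())
        if line_count >= limit:  # the rule is "<500", so exactly 500 fails
            results.append((skill_file.parent.name, line_count))
    return results
```

The actionability and content-separation checks need judgment and are not covered by this sketch.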

Phase 4: Justification Audit (Files)

For each significant file in $AOPS/:

Search specs: Grep $AOPS/specs/ for references
Check core docs: JIT-INJECTION.md, README.md, INDEX.md
Classify: Justified / Implicit / Orphan

Skip: __pycache__/, .git/, individual files within skills, tests, assets

Phase 4b: Instruction Justification Audit

Every behavioral instruction injected into agents must trace to enforcement-map.md.

Unjustified instructions are bloat - they cost tokens and create confusion about what's actually enforced.

Sources to scan (files injected at SessionStart or via hooks):

FRAMEWORK-PATHS.md - core instructions
AXIOMS.md, HEURISTICS.md - principle statements
skills/*/SKILL.md - skill-specific instructions
commands/*.md - command instructions
agents/*.md - agent instructions

What constitutes a "behavioral instruction":

Imperative statements: "always do X", "never do Y", "you MUST", "you SHOULD"
Conditional rules: "when X, do Y", "if X then Y"
Workflow requirements: "invoke skill X first", "before doing X, check Y"

Validation process:

Extract behavioral instructions from each source file (look for imperatives, MUSTs, SHOULDs, "always", "never", "before", "first")
For each instruction, search enforcement-map.md for:
- Direct reference to the instruction text
- Reference to the source file + line number
- Mapping to an axiom or heuristic that covers this instruction
Classify each instruction:
- Justified: Appears in enforcement-map.md with axiom/heuristic mapping
- Implicit: Derives from a documented axiom/heuristic but not explicitly in enforcement-map.md
- Orphan: No traceability - FLAG FOR REVIEW

Example orphan (discovered in session):

FRAMEWORK-PATHS.md:35 - "When working with session logs, always invoke Skill(skill='transcript') first"
→ NOT in enforcement-map.md
→ No axiom/heuristic reference
→ ORPHAN - needs justification or removal
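
The extraction and lookup steps can be sketched in Python. This is a rough first pass under stated assumptions: the keyword list comes from the validation process above, the function names are hypothetical, and the lookup only covers literal text and file:line references. Mapping an instruction to a covering axiom or heuristic still requires reading enforcement-map.md.

```python
import re

# Keywords taken from the extraction step above; matching is case-insensitive.
INSTRUCTION_RE = re.compile(
    r"\b(always|never|must|should|before|first)\b", re.IGNORECASE
)


def extract_instructions(filename: str, text: str) -> list[tuple[str, str]]:
    """Return ("file:line", stripped line) for lines that look like instructions."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if INSTRUCTION_RE.search(line):
            hits.append((f"{filename}:{lineno}", line.strip()))
    return hits


def is_traced(ref: str, instruction: str, enforcement_map: str) -> bool:
    """True if the map quotes the instruction or cites its file:line reference.

    The negative lookahead stops "file.md:3" from matching "file.md:35".
    """
    if instruction in enforcement_map:
        return True
    return re.search(re.escape(ref) + r"(?!\d)", enforcement_map) is not None
```

Keyword matching over-reports (for example, "first" in a descriptive sentence), so treat the hits as candidates to classify, not as confirmed instructions.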

Output format:

### Instruction Justification Status

**Justified** (N instructions):
- FRAMEWORK-PATHS.md:78 "NEVER hardcode paths" → [[axioms/dry-modular-explicit.md]]

**Implicit** (N instructions):
- skills/python-dev/SKILL.md:42 "use uv run" → derives from [[axioms/use-standard-tools.md]]

**Orphan** (N instructions) - REQUIRES ACTION:
- FRAMEWORK-PATHS.md:35 "invoke transcript skill first for session logs" → NO JUSTIFICATION
- commands/learn.md:56 "..." → NO JUSTIFICATION

Resolution for orphans:

Create heuristic if rule is valuable
Add to enforcement-map.md with axiom/heuristic mapping
Or DELETE the instruction if it's not worth formalizing

Phase 5: Documentation Accuracy

Verify README.md flowchart reflects actual hook architecture:

Parse Mermaid for hook names
Compare to hooks/router.py dispatch mappings
Compare to settings.json hook events
Flag drift

Phase 6: Regenerate Generated Indices

Generated indices are files for agent consumption, most of them at the repository root (INDEX.md, enforcement-map.md, WORKFLOWS.md, SKILLS.md, AXIOMS.md, HEURISTICS.md, docs/ENFORCEMENT.md). The core loop flowchart is maintained in README.md.

Regenerate each deterministically from sources:

AXIOMS.md and HEURISTICS.md

cd $AOPS && uv run python scripts/generate_principle_indices.py

Reads the axioms/ and heuristics/ folders and generates machine-readable indices sorted by priority (1-100, in bands: 1-20 core, 21-40 behavioral, 41-60 domain, 61-80 derived, 81-100 experimental).
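
The priority bands can be expressed as a small lookup. This is an illustrative helper with a hypothetical name; whether scripts/generate_principle_indices.py computes bands this way is not confirmed by this document.

```python
def priority_band(priority: int) -> str:
    """Map a 1-100 priority to the band names listed above."""
    if not 1 <= priority <= 100:
        raise ValueError(f"priority must be in 1..100, got {priority}")
    # Each entry is (inclusive upper bound, band name).
    bands = [
        (20, "core"),
        (40, "behavioral"),
        (60, "domain"),
        (80, "derived"),
        (100, "experimental"),
    ]
    for upper, name in bands:
        if priority <= upper:
            return name
    raise AssertionError("unreachable: range was validated above")
```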

SKILLS.md

cd $AOPS && uv run python scripts/generate_skills_index.py

Scans aops-core/ and aops-tools/ for skills and commands, extracts frontmatter (name, description), and generates a routing index. Preserves existing triggers from the previous SKILLS.md. Reports components missing triggers.

INDEX.md

Scan $AOPS/ directory structure
Extract file purposes from frontmatter/headers
Output annotated file tree

enforcement-map.md

Hook→Axiom Declaration Convention:

Every hook that enforces an axiom MUST declare it in its module docstring:

"""
Hook description.

Enforces: current-state-machine (Current State Machine)
"""

Multiple axioms: Enforces: fail-fast-code, trust-version-control (Fail-Fast, Trust Version Control)

Derivation sources:

hooks/*.py docstrings - parse "Enforces:" lines for axiom mappings
config/claude/settings.json - deny rules map to axioms via comments
.pre-commit-config.yaml - commit-time checks

Cross-reference validation:

Parse all hooks for "Enforces:" declarations
Compare against enforcement-map.md Axiom→Enforcement table
Flag discrepancies:
- Hook declares axiom but enforcement-map.md shows "Prompt" level only
- enforcement-map.md lists hook but hook lacks "Enforces:" declaration
- Axiom has Hard/Soft Gate in enforcement-map.md but no hook declares it

Output: Table mapping each axiom to its enforcement mechanism, hook, trigger point, and level.
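
Parsing the "Enforces:" declarations can be sketched with the standard ast module. This is a minimal illustration with hypothetical function names. It assumes one "Enforces:" line per module docstring and takes the set of hooks listed in enforcement-map.md as an input, because the layout of that file is not specified here.

```python
import ast
import re

# Format from the convention above: "Enforces: slug-a, slug-b (Display Names)"
ENFORCES_RE = re.compile(r"^\s*Enforces:\s*([^(\n]+)", re.MULTILINE)


def declared_axioms(source: str) -> list[str]:
    """Return the axiom slugs declared in a hook module's docstring."""
    docstring = ast.get_docstring(ast.parse(source)) or ""
    match = ENFORCES_RE.search(docstring)
    if not match:
        return []
    return [slug.strip() for slug in match.group(1).split(",") if slug.strip()]


def hooks_missing_declaration(
    hook_sources: dict[str, str], hooks_in_map: set[str]
) -> list[str]:
    """Hooks the map lists whose source has no "Enforces:" line (or no source)."""
    return sorted(
        name
        for name in hooks_in_map
        if not declared_axioms(hook_sources.get(name, ""))
    )
```

This covers only the second discrepancy type listed above. The other two need the enforcement level of each axiom from enforcement-map.md.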

WORKFLOWS.md

Derive task routing from:

skills/*/SKILL.md frontmatter - what task types each skill handles
skills/*/workflows/*.md - skill-specific workflow files (e.g., skills/framework/workflows/05-feature-development.md)
agents/*.md - what workflows each agent uses
enforcement-map.md (Soft Gate Guardrails section) - type→guardrail mappings

Skill workflow discovery: Scan all skills/*/workflows/*.md files and extract:

Workflow name from filename (e.g., 05-feature-development.md → feature-development)
Purpose from first heading or frontmatter description
Parent skill from directory path (e.g., skills/framework/workflows/ → framework skill)

Include in output:

A "Skill Workflows" section listing all discovered workflow files
Path reference for each (e.g., [[skills/framework/workflows/05-feature-development]])
Brief description extracted from file

Output: Table of task types, when to use each, workflow, and skill. Plus skill workflow index.
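
The filename and directory rules for skill workflow discovery can be sketched as below. The function name is hypothetical, and the sketch assumes paths follow the skills/<skill>/workflows/<NN-name>.md shape shown in the examples.

```python
import re
from pathlib import PurePosixPath


def workflow_info(path: str) -> dict[str, str]:
    """Derive workflow name, parent skill, and wikilink from a workflow file path."""
    p = PurePosixPath(path)
    parts = p.parts
    if len(parts) < 4 or parts[-2] != "workflows" or parts[-4] != "skills":
        raise ValueError(f"not a skill workflow path: {path}")
    name = re.sub(r"^\d+-", "", p.stem)  # drop the numeric ordering prefix
    return {
        "workflow": name,
        "skill": parts[-3],
        "link": f"[[{p.with_suffix('')}]]",  # wikilink without the .md suffix
    }
```

The purpose text still has to be read from each file's first heading or frontmatter description.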

README.md (Core Loop Flowchart)

First: Invoke Skill(skill="flowchart") to load Mermaid diagram conventions.

Regenerate the core loop flowchart section in README.md from hook architecture sources:

Parse hooks/router.py for dispatch mappings (event→handler)
Parse config/claude/settings.json for hook event registrations
Parse hooks/*.py for hook implementations and "Enforces:" declarations
Generate Mermaid flowchart following flowchart skill conventions:
- Use decision diamonds for conditional logic
- Apply classDef for semantic coloring (hooks, skills, tools, outcomes)
- Group by execution phase (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop)
- Label edges with event types and conditions

Structure:

Vertical main flow showing session lifecycle
Horizontal insertion points for each hook event
Subgraphs for phase groupings

Location: Replace the "Core Loop" section in README.md (after the Quick Start section)

docs/ENFORCEMENT.md

Derive practical enforcement guide from:

docs/ENFORCEMENT.md (existing mechanism ladder content - preserve)
specs/enforcement.md "Component Responsibilities" section - root cause model

Root Cause Analysis section (append to end):

Copy root cause definition from spec
Copy responsibility tables (Pre/Execution/Post phases)
Copy root cause categories
Add failure→responsibility mapping examples from observed patterns

Each generated index must include header:

> **Generated by audit skill** - Do not edit manually.

Phase 7: Other Updates

Fix README.md: Update tables including sub-workflows (see below)
Report orphans: Flag for human review (do NOT auto-delete)
Report violations: List with file:line refs

Sub-Workflow Extraction for README.md

Skills with multiple workflows/modes MUST have each sub-workflow documented separately in the Skills table.

Detection: For each skills/*/SKILL.md:

Grep for ^## Workflow: or ^## Mode headers
Check for workflows/ subdirectory with separate workflow files
Check for ## Modes section listing multiple invocation patterns

Output format (add third column to Skills table):

| Skill            | Purpose                     | Sub-workflows                                |
| ---------------- | --------------------------- | -------------------------------------------- |
| session-insights | Session transcript analysis | Current (default), Batch, Issues             |
| audit            | Framework governance        | Full audit (default), Session effectiveness  |
| tasks            | Task lifecycle              | View/archive/create (default), Email capture |

Rules:

Mark default workflow with "(default)"
List workflows in order they appear in SKILL.md
If only one workflow exists, leave sub-workflows column as "—"
Extract workflow names from ## Workflow: X headers or workflows/*.md filenames
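
Header detection and the table cell rules can be sketched as follows. The function names are hypothetical. The sketch assumes headers of the form "## Workflow: X" or "## Mode: X" and expects the caller to say which workflow is the default, since the detection rules above do not define how the default is identified.

```python
import re

# Matches "## Workflow: Name" and "## Mode: Name" at the start of a line.
HEADER_RE = re.compile(r"^## (?:Workflow|Mode):[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def sub_workflows(skill_md: str) -> list[str]:
    """Return workflow names in the order they appear in SKILL.md."""
    return [m.group(1) for m in HEADER_RE.finditer(skill_md)]


def sub_workflow_cell(names: list[str], default: str = "") -> str:
    """Format the Sub-workflows column for one skill."""
    if len(names) <= 1:
        return "—"  # placeholder required when only one workflow exists
    return ", ".join(f"{n} (default)" if n == default else n for n in names)
```

Workflows that exist only as workflows/*.md files or under a "## Modes" section are not detected by this sketch.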

Phase 8: Persist Report (MANDATORY)

Every audit MUST save a written report to $ACA_DATA/projects/aops/audit/.

# Create directory if needed
mkdir -p "$ACA_DATA/projects/aops/audit"

# Generate timestamped filename (format: YYYY-MM-DD-HHMMSS-audit.md)
REPORT_PATH="$ACA_DATA/projects/aops/audit/$(date +%Y-%m-%d-%H%M%S)-audit.md"

Use the Write tool to save the complete audit report (see Report Format below) to $REPORT_PATH.

Report file MUST include:

YAML frontmatter with date, duration, and summary stats
All phase results from Phase 0-7
Clear pass/fail status for each validation criterion

After writing, confirm: Audit report saved to: [path]

Phase 9: Create Tasks for Actionable Findings

Create tasks for findings that require human action.

For each finding from Phases 0-7 that requires action:

Classify finding type using the mapping below
Create task with appropriate metadata via tasks MCP
Track task IDs for summary

Finding Type → Issue Mapping

| Finding Type                                     | Priority | Issue Type | Labels              |
| ------------------------------------------------ | -------- | ---------- | ------------------- |
| Broken wikilinks                                 | P2       | bug        | audit,documentation |
| Orphan files                                     | P3       | chore      | audit,cleanup       |
| Skill >500 lines                                 | P2       | chore      | audit,refactor      |
| Explanatory content in skill                     | P2       | chore      | audit,refactor      |
| Missing from INDEX.md                            | P3       | chore      | audit,documentation |
| Orphan instruction (no enforcement-map.md trace) | P2       | bug        | audit,governance    |
| README.md flowchart drift                        | P2       | bug        | audit,documentation |
| Hook→Axiom mismatch                              | P2       | bug        | audit,governance    |

Task Creation Pattern

mcp__plugin_aops-core_tasks__create_task(
    title="[Finding Type]: [specific details]",
    type="task",
    priority=2,  # 2 or 3, per the Finding Type → Issue Mapping table
    tags=["audit", "[category]"],
    body="[context from audit]"
)
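
One way to keep task fields consistent with the mapping table is a lookup that builds the arguments before the tool call. This is a sketch under assumptions: the helper name is hypothetical, the returned keys mirror the pattern above, and the Issue Type column is left out because the pattern always passes type="task".

```python
# Priority and labels copied from the Finding Type → Issue Mapping table.
FINDING_MAP = {
    "Broken wikilinks": (2, ["audit", "documentation"]),
    "Orphan files": (3, ["audit", "cleanup"]),
    "Skill >500 lines": (2, ["audit", "refactor"]),
    "Explanatory content in skill": (2, ["audit", "refactor"]),
    "Missing from INDEX.md": (3, ["audit", "documentation"]),
    "Orphan instruction (no enforcement-map.md trace)": (2, ["audit", "governance"]),
    "README.md flowchart drift": (2, ["audit", "documentation"]),
    "Hook→Axiom mismatch": (2, ["audit", "governance"]),
}


def task_fields(finding_type: str, details: str, context: str) -> dict:
    """Build keyword arguments shaped like the task creation pattern above."""
    priority, tags = FINDING_MAP[finding_type]  # KeyError means an unmapped type
    return {
        "title": f"{finding_type}: {details}",
        "type": "task",
        "priority": priority,
        "tags": list(tags),  # copy so callers cannot mutate the table
        "body": context,
    }
```

An unmapped finding type raises KeyError, which surfaces gaps in the table instead of silently assigning a default priority.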

Skip Conditions

Do NOT create tasks for:

Regenerated indices (Phase 6 actions) - already handled
Pass status findings - no action needed
Justified files (Phase 4) - no action needed
Implicit files (Phase 4) - acceptable, no action needed

Output Summary

After creating tasks, add to audit report:

### Tasks Created

Created N tasks:

- ns-xxx: Broken wikilink: [[foo.md]] in bar.md
- ns-yyy: Orphan file: docs/old.md
- ns-zzz: Skill over limit: skills/big/SKILL.md

Report Format

See [[references/report-format]] for the complete report template and validation criteria.
