askill
do-test-coverage-audit


Forensic test coverage audit: exhaustive detection of complexity sources and assessment of whether tests exist at the right level. Use when reviewing test quality, identifying testing gaps, or auditing test strategy. Produces a detailed accounting for the test-recommendations skill.

Updated 2/6/2026

Package Files

SKILL.md

Test Coverage Audit

Forensic analysis of test coverage quality. Not just "do you have tests" but "are you testing the right things at the right level?"

This skill produces an exhaustive audit report. For recommendations based on this report, use the test-recommendations skill. For implementation planning, use test-implementation-plan.

Philosophy

The Testing Pyramid

         ╱╲
        ╱E2E╲          Few, slow, high-confidence
       ╱──────╲
      ╱ Integ  ╲       Medium count, medium speed
     ╱──────────╲
    ╱    Unit    ╲     Many, fast, focused
   ╱──────────────╲

Comprehensive Testing Level Definitions: concepts/testing-levels.md

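| Level | Tests | Speed | Scope | Confidence |
|-------|-------|-------|-------|------------|
| Unit | Many | Fast | Single function/class | Logic correctness |
| Integration | Medium | Medium | Component boundaries | Pieces work together |
| E2E | Few | Slow | Full user journey | System actually works |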

Testing at the Right Level

Wrong level → wasted effort, false confidence, or fragile tests

| Symptom | Problem | Fix |
|---------|---------|-----|
| 500 unit tests, login broken | Missing e2e | Add e2e for critical paths |
| All e2e, CI takes 2 hours | Over-reliance on slow tests | Push more to unit/integration |
| Tests break on every refactor | Testing implementation, not behavior | Test contracts, not internals |
| High coverage, bugs slip through | Testing wrong things | Focus on user-facing behavior |

Common AI/LLM Testing Mistakes

When AI generates tests, it often makes systematic errors. Read: concepts/llm-testing-mistakes.md

| Mistake | What It Looks Like | Why It's Harmful |
|---------|--------------------|------------------|
| Tautological tests | `expect(mock).toHaveBeenCalled()` after `mock()` | Tests nothing real |
| Over-mocking | Every dependency mocked | Tests mocks, not code |
| Happy path only | No error/edge cases | Misses real failures |
| Testing implementation | Breaks on refactor | Fragile, not behavioral |
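
To make the first and third rows concrete, here is a minimal Python sketch contrasting a tautological test with behavioral ones. The function `apply_discount` is hypothetical and exists only for illustration; it is not part of this skill.

```python
from unittest.mock import Mock


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_tautological() -> None:
    # Anti-pattern: the test calls the mock itself, then asserts it was called.
    # It would still pass if apply_discount were deleted.
    notifier = Mock()
    notifier()
    notifier.assert_called()


def test_behavior() -> None:
    # Asserts on the observable output of real code.
    assert apply_discount(200.0, 25) == 150.0


def test_error_path() -> None:
    # Covers an error case, not only the happy path.
    try:
        apply_discount(200.0, 150)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")
```

All three tests pass, but only the last two would fail if `apply_discount` regressed, which is the distinction an audit should record.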

Audit Process

Phase 1: Complexity Source Detection

Goal: Create an exhaustive inventory of everything that needs testing.

1.1 Architecture Detection

Is this a microservices/distributed system?

Read: detection/microservices.md

| Signal | Detection Method |
|--------|------------------|
| Docker Compose | `ls docker-compose*.yml` |
| Kubernetes | `grep -rl "kind: Deployment" --include="*.yaml" --include="*.yml" .` |
| Service URLs in env | `grep -e "_URL=" -e "_HOST=" .env*` |
| Multiple repos/services | Directory structure analysis |

Output:

### Architecture Classification
- Type: [Monolith | Modular Monolith | Microservices | Serverless]
- Services detected: [list with protocols]
- Inter-service communication: [HTTP | gRPC | Message Queue | None]
- Contract testing present: [Yes/No]
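
The file-based signals from the detection table could also be gathered programmatically. The following is a rough sketch, not the skill's actual implementation; the function name `detect_architecture_signals` and the returned keys are assumptions made for illustration.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches lines such as AUTH_URL=... or DB_HOST=... (optionally prefixed by `export`).
SERVICE_VAR = re.compile(r"^\s*(?:export\s+)?(\w+_(?:URL|HOST))=")


def detect_architecture_signals(root: str) -> dict[str, list[str]]:
    """Collect Compose files, Kubernetes Deployments, and service variables under root."""
    base = Path(root)
    compose = sorted(p.name for p in base.glob("docker-compose*.yml"))

    deployments = []
    for pattern in ("*.yaml", "*.yml"):
        for path in base.rglob(pattern):
            if path.is_file() and "kind: Deployment" in path.read_text(errors="ignore"):
                deployments.append(path.relative_to(base).as_posix())

    service_vars = []
    for env_file in base.glob(".env*"):
        if not env_file.is_file():
            continue
        for line in env_file.read_text(errors="ignore").splitlines():
            match = SERVICE_VAR.match(line)
            if match:
                service_vars.append(match.group(1))

    return {
        "compose_files": compose,
        "k8s_deployments": sorted(deployments),
        "service_vars": sorted(service_vars),
    }
```

A non-empty `k8s_deployments` or several `service_vars` entries suggests a distributed system, but the final classification still needs the directory structure analysis listed in the table.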

1.2 Data Interaction Detection

What data does this system touch?

Read: detection/data-interactions.md

| Category | Detection |
|----------|-----------|
| Databases | Grep for ORM imports, connection strings |
| Caches | Grep for Redis/Memcached clients |
| File system | Grep for fs/pathlib operations |
| User config | Look for config loading patterns |
| Secrets | Check for secret manager integrations |

Output:

### Data Interactions
| Category | Technology | Locations | Tested? |
|----------|------------|-----------|---------|
| Database | PostgreSQL/SQLAlchemy | models/*.py | ✅/❌ |
| Cache | Redis | services/cache.py | ✅/❌ |
| Files | S3/boto3 | storage/*.py | ✅/❌ |
| Config | pydantic/settings | config.py | ✅/❌ |
| Secrets | AWS SecretsManager | auth/*.py | ✅/❌ |
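
The grep-style detection for this phase can be sketched as a small pattern scan over Python sources. The patterns below are illustrative assumptions covering only a few libraries; a real audit would extend them per language and per category.

```python
from __future__ import annotations

import re

# Illustrative import patterns only; these are assumptions, not an exhaustive list.
DATA_PATTERNS = {
    "Database": re.compile(r"^\s*(?:from|import)\s+(?:sqlalchemy|psycopg2)\b", re.M),
    "Cache": re.compile(r"^\s*(?:from|import)\s+(?:redis|pymemcache)\b", re.M),
    "Files": re.compile(r"^\s*(?:from|import)\s+(?:pathlib|boto3)\b", re.M),
}


def find_data_interactions(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each category to the files whose source text matches its pattern."""
    found: dict[str, list[str]] = {}
    for filename, text in sorted(sources.items()):
        for category, pattern in DATA_PATTERNS.items():
            if pattern.search(text):
                found.setdefault(category, []).append(filename)
    return found
```

The result fills the Category and Locations columns of the table; the Tested? column still requires cross-referencing the listed files against the test inventory.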

1.3 External API Detection

What external services does this call?

Read: detection/external-apis.md

| Category | Detection |
|----------|-----------|
| HTTP clients | Grep for requests/axios/fetch |
| Payment SDKs | Grep for stripe/paypal |
| Auth providers | Grep for oauth/auth0/cognito |
| Cloud services | Grep for boto3/gcloud/azure |
| Webhooks | Grep for webhook endpoints |

Output:

### External API Integrations
| Service | SDK/Client | Criticality | Error Handling? | Tested? |
|---------|------------|-------------|-----------------|---------|
| Stripe | stripe-python | Critical | ⚠️ Partial | ❌ |
| SendGrid | sendgrid | High | ❌ None | ❌ |
| Auth0 | auth0-python | Critical | ✅ Yes | ✅ |
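
One possible way to approximate the "Error Handling?" column for Python code is to look for calls on an external client module that are not inside a `try` body. The sketch below uses the standard `ast` module; the class and function names are hypothetical, and the heuristic is deliberately crude (it ignores aliased imports, client instances, and whether the `except` clause actually handles the failure).

```python
from __future__ import annotations

import ast


class UnguardedCallFinder(ast.NodeVisitor):
    """Record calls on watched module names that are not inside a `try` body."""

    def __init__(self, watched: set[str]) -> None:
        self.watched = watched
        self.try_depth = 0
        self.unguarded: list[tuple[int, str]] = []

    def visit_Try(self, node: ast.Try) -> None:
        # Only the `try:` body counts as guarded; handlers, else, finally do not.
        self.try_depth += 1
        for child in node.body:
            self.visit(child)
        self.try_depth -= 1
        for child in node.handlers + node.orelse + node.finalbody:
            self.visit(child)

    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in self.watched
            and self.try_depth == 0
        ):
            self.unguarded.append((node.lineno, f"{func.value.id}.{func.attr}"))
        self.generic_visit(node)


def find_unguarded_calls(source: str, watched=("requests", "stripe")) -> list[tuple[int, str]]:
    """Return (line, call) pairs for watched calls outside any try body."""
    finder = UnguardedCallFinder(set(watched))
    finder.visit(ast.parse(source))
    return finder.unguarded
```

An empty result would correspond to "✅ Yes", a mix of guarded and unguarded calls to "⚠️ Partial", and no guarded calls at all to "❌ None", subject to manual review.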

1.4 Interactive/User Input Detection

Does this require user interaction for testing?

Read: concepts/interactive-testing.md

| Pattern | Testing Approach |
|---------|------------------|
| CLI prompts | PTY/pexpect testing |
| Shell completions | Completion script testing |
| TUI (full-screen) | Virtual terminal (pyte) |
| Desktop GUI | Platform-specific (Playwright/XCTest) |
| Device-specific | Hardware test farms or mocks |

Output:

### Interactive Components
| Component | Type | Can Test in CI? | Current Approach |
|-----------|------|-----------------|------------------|
| Setup wizard | CLI prompts | ✅ (pexpect) | ❌ Untested |
| Tab completion | Shell integration | ✅ (script) | ❌ Untested |
| Dashboard | Full-screen TUI | ⚠️ (with pyte) | ❌ Untested |
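
For simple CLI prompts that read from standard input, a CI-friendly test does not always need a PTY. The sketch below is a standard-library stand-in for the pexpect approach named in the table: it pipes answers to a hypothetical one-prompt wizard, inlined as `WIZARD_SOURCE` so the example is self-contained. Programs that require a real terminal (password prompts, full-screen TUIs) would still need pexpect or pyte.

```python
from __future__ import annotations

import subprocess
import sys

# Hypothetical one-prompt CLI, inlined so the example runs on its own.
WIZARD_SOURCE = "name = input('Project name: ')\nprint(f'Created {name}')\n"


def run_wizard(answers: list[str]) -> subprocess.CompletedProcess:
    """Run the wizard in a child process, feeding one answer per line on stdin."""
    return subprocess.run(
        [sys.executable, "-c", WIZARD_SOURCE],
        input="\n".join(answers) + "\n",
        capture_output=True,
        text=True,
        timeout=10,
    )


def test_setup_wizard_accepts_project_name() -> None:
    result = run_wizard(["demo"])
    assert result.returncode == 0
    assert "Created demo" in result.stdout
```

The `timeout` keeps a hung prompt from stalling CI, which is the usual failure mode when an interactive component is tested without scripted input.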

Phase 2: Detect Project Type & Language

2.1 Project Type Detection

Identify the scenario to set testing expectations:

| Signal | Project Type | Scenario Reference |
|--------|--------------|--------------------|
| bin/, CLI entry point, argparse | CLI Tool | scenarios/cli.md |
| React/Vue/Angular, pages/, components/ | Web Frontend | scenarios/web-frontend.md |
| Express/FastAPI/Rails, routes/ | Web Backend/API | scenarios/web-backend.md |
| Both frontend + backend | Full Stack | scenarios/fullstack.md |
| npm package, library exports | Library/SDK | scenarios/library.md |
| iOS/Android, mobile frameworks | Mobile App | scenarios/mobile.md |
| Dockerfile, k8s manifests, terraform | Infrastructure | scenarios/infrastructure.md |
| agents/, prompts/, LLM calls | AI/Agent System | scenarios/ai-agents.md |
| Airflow DAGs, Spark jobs, ETL | Data Pipeline | scenarios/data-pipelines.md |
| Kafka, WebSockets, real-time streams | Real-time System | scenarios/realtime-systems.md |
| Firmware, HAL, microcontrollers | Embedded/IoT | scenarios/embedded-iot.md |
| Electron, Qt, WPF, native GUI | Desktop App | scenarios/desktop-apps.md |
| manifest.json, Chrome/Firefox extension | Browser Extension | scenarios/browser-extensions.md |
| Unity, Unreal, game engine | Game Development | scenarios/game-development.md |
| Solidity, smart contracts, Web3 | Blockchain/Web3 | scenarios/blockchain.md |

Read the appropriate scenario file for testing expectations specific to that project type.

2.2 Language/Framework Detection

| Language | Reference |
|----------|-----------|
| Python | languages/python.md |
| TypeScript/JavaScript | languages/typescript.md |
| Go | languages/go.md |
| Rust | languages/rust.md |
| Java/Kotlin | languages/java.md |
| Ruby | languages/ruby.md |
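
The source does not say how the language is detected. One simple heuristic, sketched here as an assumption, is to count file extensions and rank the matching reference documents. The extension mapping is illustrative; only the reference paths come from the table.

```python
from __future__ import annotations

from collections import Counter
from pathlib import Path

# Assumed extension mapping; reference paths are taken from the table above.
EXTENSION_TO_REFERENCE = {
    ".py": "languages/python.md",
    ".ts": "languages/typescript.md",
    ".tsx": "languages/typescript.md",
    ".js": "languages/typescript.md",
    ".jsx": "languages/typescript.md",
    ".go": "languages/go.md",
    ".rs": "languages/rust.md",
    ".java": "languages/java.md",
    ".kt": "languages/java.md",
    ".rb": "languages/ruby.md",
}


def rank_language_references(paths: list[str]) -> list[tuple[str, int]]:
    """Return (reference, file count) pairs, most common language first."""
    counts: Counter[str] = Counter()
    for path in paths:
        reference = EXTENSION_TO_REFERENCE.get(Path(path).suffix.lower())
        if reference:
            counts[reference] += 1
    return counts.most_common()
```

In a polyglot repository more than one reference may apply, so the ranked list is more useful than a single winner.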

Phases 3-6: Forensic Analysis

For the detailed forensic analysis (test inventory, coverage mapping, quality assessment, gap analysis), spawn the test-auditor agent:

Use the Task tool to spawn the do:test-auditor agent:

Execute forensic test coverage analysis.

Project: [current working directory]
Framework: [detected framework from Phase 2]
Intensity: [quick|medium|thorough]

Run phases 3-6:
- Phase 3: Test Inventory
- Phase 4: Coverage Mapping
- Phase 5: Quality Assessment
- Phase 6: Gap Analysis

Output: TEST-AUDIT-<timestamp>.md in .agent_planning/

The agent will complete the audit report with all remaining phases.


Output Format

The audit produces a comprehensive report:

# Test Coverage Audit Report
**Project**: [name]
**Date**: [date]
**Auditor**: Claude

## Executive Summary
**Overall Health**: [Healthy | Needs Work | Critical Gaps]
**Architecture**: [type]
**Coverage Distribution**: Unit n% | Integration n% | E2E n%
**Critical Issues**: [count]

---

## 1. Architecture Analysis
[From Phase 1.1]

## 2. Complexity Source Inventory
### 2.1 Data Interactions
[From Phase 1.2]

### 2.2 External APIs
[From Phase 1.3]

### 2.3 Interactive Components
[From Phase 1.4]

---

## 3. Test Inventory
[From Phase 3]

---

## 4. Coverage Matrix
[From Phase 4]

---

## 5. Quality Assessment
### 5.1 Red Flags Detected
[From Phase 5.1]

### 5.2 Quality Checklist Results
[From Phase 5.2]

---

## 6. Gap Analysis

### P0 - Critical (Must Fix)
[From Phase 6]

### P1 - Significant (Should Fix)
[From Phase 6]

### P2 - Minor (Nice to Have)
[From Phase 6]

---

## 7. Risk Assessment

| Risk | Impact | Likelihood | Current Mitigation |
|------|--------|------------|-------------------|
| Payment failures undetected | High | Medium | ❌ None |
| Auth bypass possible | Critical | Low | ⚠️ Partial |

---

## 8. Appendix

### A. Files Analyzed
[List of all files examined]

### B. Test File Inventory
[Complete list of test files]

### C. Detection Commands Used
[Commands run during audit]
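
The report template does not define how the Coverage Distribution line is computed. One plausible reading, assumed here, is the share of test cases at each level. The helper below is a hypothetical sketch of that interpretation.

```python
def coverage_distribution(unit: int, integration: int, e2e: int) -> str:
    """Format per-level test counts as the Executive Summary distribution line.

    Assumes "distribution" means each level's share of the total test count.
    """
    total = unit + integration + e2e
    if total == 0:
        return "Unit 0% | Integration 0% | E2E 0%"

    def pct(count: int) -> int:
        return round(100 * count / total)

    return f"Unit {pct(unit)}% | Integration {pct(integration)}% | E2E {pct(e2e)}%"
```

Because each share is rounded independently, the three percentages may not sum to exactly 100.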

Intensity Levels

| Level | Scope | Estimated Duration |
|-------|-------|--------------------|
| Quick | Architecture + high-level gaps | 10-15 min |
| Medium | + Quality assessment + coverage matrix | 30-45 min |
| Thorough | + Test-by-test review + risk analysis | 60-90 min |

Reference Documents

Concepts

| Topic | Reference |
|-------|-----------|
| Testing levels defined | concepts/testing-levels.md |
| AI/LLM testing mistakes | concepts/llm-testing-mistakes.md |
| Interactive system testing | concepts/interactive-testing.md |
| Unknown UI testing | concepts/unknown-ui-testing.md |

Detection

| Area | Reference |
|------|-----------|
| Microservices detection | detection/microservices.md |
| Data interaction detection | detection/data-interactions.md |
| External API detection | detection/external-apis.md |

Scenarios (15)

| Category | Reference |
|----------|-----------|
| CLI Tools | scenarios/cli.md |
| Web Frontend | scenarios/web-frontend.md |
| Web Backend/API | scenarios/web-backend.md |
| Full Stack | scenarios/fullstack.md |
| Library/SDK | scenarios/library.md |
| Mobile App | scenarios/mobile.md |
| Infrastructure | scenarios/infrastructure.md |
| AI/Agent System | scenarios/ai-agents.md |
| Data Pipeline/ETL | scenarios/data-pipelines.md |
| Real-time System | scenarios/realtime-systems.md |
| Embedded/IoT | scenarios/embedded-iot.md |
| Desktop App | scenarios/desktop-apps.md |
| Browser Extension | scenarios/browser-extensions.md |
| Game Development | scenarios/game-development.md |
| Blockchain/Web3 | scenarios/blockchain.md |

Languages (6)

| Language | Reference |
|----------|-----------|
| Python | languages/python.md |
| TypeScript/JS | languages/typescript.md |
| Go | languages/go.md |
| Rust | languages/rust.md |
| Java/Kotlin | languages/java.md |
| Ruby | languages/ruby.md |

Integration

This skill is invoked as a dimension of /do:plan audit:

  • Trigger: "audit tests", "test coverage audit", "testing audit"
  • Can run alongside other audit dimensions

Related Skills

| Skill | Purpose |
|-------|---------|
| test-recommendations | Generate strategic test plan from audit |
| test-implementation-plan | Create execution plan with testability refactoring |
| do:add-tests | Write specific tests |
| do:setup-testing | Set up test framework |
| do:tdd-workflow | Test-first development |

Install

Requires askill CLI v1.0+

AI Quality Score

72/100 (analyzed 2/18/2026)

Well-structured forensic test coverage audit skill with clear methodology, phased approach, and comprehensive output format. References external documents and spawns agents for detailed work, making it a reference-style skill that still provides actionable detection methods. The deep path nesting and agent spawning indicate internal tool usage, but the content is generic and technically sound. Includes when-to-use guidance, structured tags, and clear process flow.


Metadata

License: unknown
Version: -
Updated: 2/6/2026
Publisher: brandon-fryslie

Tags

api, ci-cd, database, github-actions, llm, security, testing