plugin: autopilot
updated: 2026-01-20

Proof-of-Work

Version: 0.1.0
Purpose: Generate validation artifacts for autonomous task completion
Status: Phase 1

When to Use

Use this skill when you need to:

Generate proof artifacts after task completion
Capture screenshots for UI verification
Parse and report test results
Calculate confidence scores for task validation
Determine if a task can be auto-approved

Overview

Proof-of-work is the mechanism that validates task completion. Every finished task must include verifiable artifacts that demonstrate the work was done correctly.

Proof Types by Task

Bug Fix Proof

Artifact	Required	Purpose
Git diff	Yes	Show minimal, focused changes
Test results	Yes	All tests passing
Regression test	Yes	Specific test for the bug
Error log (before/after)	Optional	Visual evidence

Feature Proof

Artifact	Required	Purpose
Screenshots	Yes	Visual verification
Test results	Yes	Functionality works
Coverage report	Yes	>= 80% coverage
Build output	Yes	Builds successfully
Deployment URL	Optional	Live demo

UI Change Proof

Artifact	Required	Purpose
Desktop screenshot	Yes	1920x1080 view
Mobile screenshot	Yes	375x667 view
Tablet screenshot	Yes	768x1024 view
Accessibility score	Yes	>= 80 Lighthouse
Visual regression	Optional	BackstopJS diff
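
The tables above can be encoded as data so that a missing required artifact is caught before any scoring happens. The following is a minimal sketch, not part of the plugin: the `TaskType` values, the artifact keys, and the `missingArtifacts` helper are hypothetical names chosen for illustration.

```typescript
// Hypothetical encoding of the "Required: Yes" rows from the proof tables.
// Task type names and artifact keys are illustrative, not the plugin's API.
type TaskType = 'bugfix' | 'feature' | 'ui';

const REQUIRED_ARTIFACTS: Record<TaskType, string[]> = {
  bugfix: ['gitDiff', 'testResults', 'regressionTest'],
  feature: ['screenshots', 'testResults', 'coverageReport', 'buildOutput'],
  ui: ['desktopScreenshot', 'mobileScreenshot', 'tabletScreenshot', 'accessibilityScore'],
};

// Returns the required artifacts that are absent from the provided list.
function missingArtifacts(type: TaskType, provided: string[]): string[] {
  const have = new Set(provided);
  return REQUIRED_ARTIFACTS[type].filter((name) => !have.has(name));
}

// A bug fix submitted without its regression test is flagged.
console.log(missingArtifacts('bugfix', ['gitDiff', 'testResults']));
```

Optional artifacts (error logs, deployment URL, visual regression) are deliberately left out of the map, since their absence should not block a proof.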

Screenshot Capture

Playwright Pattern:

import { chromium } from 'playwright';

// Viewports match the UI Change Proof table: desktop, mobile, tablet.
const VIEWPORTS = [
  { name: 'desktop', width: 1920, height: 1080 },
  { name: 'mobile', width: 375, height: 667 },
  { name: 'tablet', width: 768, height: 1024 },
];

async function captureScreenshots(url: string, outputDir: string) {
  const browser = await chromium.launch({ headless: true });
  try {
    const context = await browser.newContext();
    const page = await context.newPage();

    for (const { name, width, height } of VIEWPORTS) {
      // Resize first, then reload so responsive layouts render at this size
      await page.setViewportSize({ width, height });
      await page.goto(url);
      await page.waitForLoadState('networkidle');
      await page.screenshot({
        path: `${outputDir}/${name}.png`,
        fullPage: true,
      });
    }
  } finally {
    // Always release the browser, even if a navigation or screenshot fails
    await browser.close();
  }
}

Confidence Scoring

Algorithm:

interface ProofArtifacts {
  testResults?: { passed: number; total: number };
  buildSuccessful?: boolean;
  lintErrors?: number;
  screenshots?: string[];
  testCoverage?: number;
  performanceScore?: number; // collected, but not used by calculateConfidence
}

function calculateConfidence(artifacts: ProofArtifacts): number {
  let score = 0;

  // Tests (40 points): awarded only when at least one test ran and all passed
  if (artifacts.testResults) {
    const { passed, total } = artifacts.testResults;
    if (total > 0 && passed === total) {
      score += 40;
    }
  }

  // Build (20 points)
  if (artifacts.buildSuccessful) {
    score += 20;
  }

  // Coverage (20 points)
  if (artifacts.testCoverage) {
    if (artifacts.testCoverage >= 80) score += 20;
    else if (artifacts.testCoverage >= 60) score += 15;
    else if (artifacts.testCoverage >= 40) score += 10;
    else score += 5;
  }

  // Screenshots (10 points)
  if (artifacts.screenshots) {
    if (artifacts.screenshots.length >= 3) score += 10;
    else if (artifacts.screenshots.length >= 1) score += 5;
  }

  // Lint (10 points)
  if (artifacts.lintErrors === 0) {
    score += 10;
  }

  return score;
}

Confidence Thresholds

Confidence	Action
>= 95%	Auto-approve (In Review -> Done)
80-94%	Manual review required
< 80%	Validation failed, iterate
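
The threshold table maps directly onto a small decision function. This is a sketch under the assumption that the score is the 0-100 value returned by `calculateConfidence`; the `ReviewAction` type and the `actionForConfidence` name are hypothetical.

```typescript
// Hypothetical action names mirroring the three rows of the thresholds table.
type ReviewAction = 'auto-approve' | 'manual-review' | 'iterate';

// Maps a 0-100 confidence score to the action from the thresholds table.
function actionForConfidence(score: number): ReviewAction {
  if (score >= 95) return 'auto-approve';
  if (score >= 80) return 'manual-review';
  return 'iterate';
}

console.log(actionForConfidence(100)); // auto-approve
console.log(actionForConfidence(85));  // manual-review
console.log(actionForConfidence(40));  // iterate
```

Because the test component of the scoring algorithm is worth 40 points and is all-or-nothing, a proof with any failing test can score at most 60 and always lands in the iterate band.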

Proof Summary Template

# Proof of Work

**Task**: {issue_id}
**Type**: {task_type}
**Confidence**: {score}%

## Test Results
- Total: {total}
- Passed: {passed}
- Failed: {failed}
- Coverage: {coverage}%

## Build
- Status: {status}
- Duration: {duration}

## Screenshots
- Desktop: proof/desktop.png
- Mobile: proof/mobile.png
- Tablet: proof/tablet.png

## Artifacts
- test-results.txt
- coverage.json
- build-output.txt
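
One way to fill the `{placeholder}` fields in the template is a single regular-expression pass. The sketch below is illustrative only: `fillTemplate` is a hypothetical helper, and the issue ID `ABC-123` is a made-up value.

```typescript
// Replaces each {placeholder} with its value from `values`.
// Placeholders with no matching key are left untouched so gaps stay visible.
function fillTemplate(
  template: string,
  values: Record<string, string | number>,
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match,
  );
}

// First lines of the template above, filled with made-up example values.
const header = [
  '**Task**: {issue_id}',
  '**Type**: {task_type}',
  '**Confidence**: {score}%',
].join('\n');

console.log(fillTemplate(header, { issue_id: 'ABC-123', task_type: 'feature', score: 100 }));
```

Leaving unknown placeholders in the output, rather than replacing them with an empty string, makes an incomplete proof easy to spot when the summary is posted as a comment.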

Examples

Example 1: Feature Proof Generation

const proof = {
  testResults: { passed: 15, total: 15 },
  buildSuccessful: true,
  lintErrors: 0,
  screenshots: ['desktop.png', 'mobile.png', 'tablet.png'],
  testCoverage: 85,
};

const confidence = calculateConfidence(proof);
// 40 (tests) + 20 (build) + 20 (coverage) + 10 (screenshots) + 10 (lint) = 100%

Example 2: Partial Proof

const proof = {
  testResults: { passed: 12, total: 15 },  // Some failing
  buildSuccessful: true,
  lintErrors: 2,
  screenshots: ['desktop.png'],
  testCoverage: 65,
};

const confidence = calculateConfidence(proof);
// 0 (tests fail) + 20 (build) + 15 (coverage) + 5 (1 screenshot) + 0 (lint errors) = 40%
// Result: Validation failed, must iterate

Best Practices

Always capture screenshots for UI work
Run full test suite, not just affected tests
Include coverage report for features
Build must pass before any proof is valid
Store proofs in session directory for debugging
Generate proof summary in markdown for Linear comments
