E2E Tester: Human + AI Testing Workflow

A testing system that separates what users do (UI actions) from what Claude does (automated verifications). After implementing a feature, generate contextual tests where users verify the UI experience and Claude validates the backend.

Core Principle

Users do what only humans can do. Claude does everything else.

User's Job (Manual)	Claude's Job (Automated)
Navigate to URLs	Run database queries
Click buttons	Make API calls (curl)
Fill forms	Grep application logs
Visual verification	Check file system changes
Subjective feedback	Validate data integrity
Take screenshots	Pattern match outputs
Report observations	Execute any CLI command

When to Use

Use this skill when:

A feature implementation is complete (all todos marked complete)
The user explicitly requests E2E testing or manual verification
Significant changes are about to be merged
The user says "test this", "verify the feature", or "run e2e tests"

Prerequisites

Docker must be installed and running
First run will build the Docker image (~30-60 seconds)

Complete Workflow

1. Generate Tests (Claude does this)

After implementing a feature, generate tests with clear separation:

node cli.js generate-tests --feature "User registration" --tests '[
  {
    "title": "User can register and account is created",
    "description": "Complete registration flow with backend validation",
    "category": "integration",
    "priority": "critical",
    "userSteps": [
      {"type": "action", "instruction": "Navigate to /register"},
      {"type": "action", "instruction": "Enter email: test@example.com"},
      {"type": "action", "instruction": "Enter password: SecurePass123!"},
      {"type": "action", "instruction": "Click Create Account button"},
      {"type": "observe", "instruction": "Verify success message appears"},
      {"type": "observe", "instruction": "Verify redirected to dashboard"},
      {"type": "screenshot", "instruction": "Screenshot the dashboard"}
    ],
    "autoVerifications": [
      {
        "description": "User record created in database",
        "command": "psql -c \"SELECT id, email FROM users WHERE email='"'"'test@example.com'"'"'\"",
        "expectedPattern": "test@example\\.com",
        "expectedDescription": "Should return 1 row with the email"
      },
      {
        "description": "Password is properly hashed",
        "command": "psql -c \"SELECT password FROM users WHERE email='"'"'test@example.com'"'"'\"",
        "expectedPattern": "\\$2[aby]\\$",
        "expectedDescription": "Password should be bcrypt hashed"
      },
      {
        "description": "Welcome email queued",
        "command": "grep '"'"'Sending welcome email to test@example.com'"'"' /var/log/app.log | tail -1",
        "expectedPattern": "Sending welcome email",
        "expectedDescription": "Log should show email was queued"
      }
    ]
  }
]'
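
The `'"'"'` sequences in the command above are shell quoting for a single quote inside a single-quoted string, and they are easy to get wrong by hand. As a minimal sketch, assuming the CLI is driven from Python and `cli.js` sits in the working directory, the tests can be built as ordinary data and passed as one argv element, which avoids shell quoting altogether. The helper name `build_generate_tests_argv` is hypothetical; only the `generate-tests`, `--feature`, and `--tests` names come from this document.

```python
import json


def build_generate_tests_argv(feature, tests, cli_path="cli.js"):
    """Return the argv list for `node cli.js generate-tests` (hypothetical helper).

    The JSON payload travels as a single argv element, so no shell quoting is needed.
    """
    return [
        "node", cli_path, "generate-tests",
        "--feature", feature,
        "--tests", json.dumps(tests),
    ]


tests = [
    {
        "title": "User can register and account is created",
        "category": "integration",
        "priority": "critical",
        "userSteps": [
            {"type": "action", "instruction": "Navigate to /register"},
        ],
        "autoVerifications": [
            {
                "description": "User record created in database",
                "command": "psql -c \"SELECT id, email FROM users WHERE email='test@example.com'\"",
                "expectedPattern": r"test@example\.com",
                "expectedDescription": "Should return 1 row with the email",
            },
        ],
    }
]

argv = build_generate_tests_argv("User registration", tests)
# To actually invoke the CLI: subprocess.run(argv, check=True)
```

Because the argument list is never parsed by a shell, the single quotes inside the SQL string need no special treatment.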

2. Start Testing Container

node cli.js start-container

This opens the webapp at http://localhost:3458.
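
Since the first run has to build the image, the webapp may not answer immediately. The following is a rough sketch of a readiness check, not part of the tool itself: it polls the URL until it responds or a timeout elapses. The function names, the 120-second default, and the injectable `probe` and `sleep` parameters are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=2):
            return True
    except urllib.error.HTTPError:
        return True  # the server is up, even though it returned an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...


def wait_for_webapp(url="http://localhost:3458", timeout=120.0, interval=1.0,
                    probe=http_probe, sleep=time.sleep):
    """Poll until the webapp responds; return False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

If a different port was passed via `--port`, the `url` argument would need to change to match.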

3. User Completes Manual Testing

In the webapp, the user:

Goes through each user step (action, observe, screenshot)
Checks off completed steps
Uploads screenshots as evidence
Marks each test as Pass/Fail/Skip
Adds remarks if needed
Submits when complete

Important: The user NEVER runs commands. They only do UI actions.

4. Claude Runs Automated Verifications

After the user submits, Claude lists the pending verifications, runs each command, and reports the results:

# Get list of verifications to run
node cli.js run-verifications --session <session-id> --list

# Claude executes each command using Bash tool
# Then reports results:
node cli.js run-verifications --session <session-id> --report '[
  {
    "testId": "t_001",
    "verificationId": "a1",
    "status": "passed",
    "output": "id | email\n1 | test@example.com",
    "matchResult": "match"
  }
]'
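
The "execute, then report" step above can be sketched as follows: run the command, search its output for `expectedPattern`, and build one report entry. This is an illustration under assumptions, not the tool's implementation. The field names and the `"passed"` and `"match"` values come from the report example above; the `"failed"`, `"error"`, and `"no_match"` values and the function name are guesses. Python's `re` also stands in for whatever regex engine the tool uses.

```python
import re
import subprocess


def run_verification(test_id, verification_id, command, expected_pattern, timeout=30):
    """Run one autoVerification command and return a report entry (hypothetical helper)."""
    entry = {"testId": test_id, "verificationId": verification_id}
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # "error" and "no_match" are assumed values, not confirmed by the document
        entry.update(status="error", output="command timed out", matchResult="no_match")
        return entry

    output = (proc.stdout + proc.stderr).strip()
    matched = re.search(expected_pattern, output) is not None
    entry.update(
        status="passed" if matched else "failed",  # "failed" is an assumed value
        output=output,
        matchResult="match" if matched else "no_match",
    )
    return entry


ok = run_verification("t_001", "a1", "echo test@example.com", r"test@example\.com")
bad = run_verification("t_001", "a2", "echo nothing", r"test@example\.com")
```

A list of such entries, serialized with `json.dumps`, would form the value passed to `--report`.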

5. Analyze Combined Results

node cli.js get-results --session <session-id>

Returns both manual and automated results:

{
  "summary": {
    "manualTotal": 5,
    "manualPassed": 4,
    "manualFailed": 1,
    "manualSkipped": 0,
    "autoTotal": 8,
    "autoPassed": 7,
    "autoFailed": 1,
    "autoErrors": 0
  },
  "failures": [...]
}
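
As a small sketch of how the summary might be consumed, the snippet below turns the counts into pass ratios and a single "needs fixing" flag. The field names come from the JSON above. The helper name, the shape of its return value, and the rule that any failure or error triggers a fix are assumptions. The elided `failures` list is left empty here so the sample parses as JSON.

```python
import json


def summarize(results):
    """Condense a get-results payload into pass ratios and a fix flag (hypothetical helper)."""
    s = results["summary"]
    manual_ok = s["manualFailed"] == 0
    auto_ok = s["autoFailed"] == 0 and s["autoErrors"] == 0
    return {
        "manual": f'{s["manualPassed"]}/{s["manualTotal"]}',
        "auto": f'{s["autoPassed"]}/{s["autoTotal"]}',
        "needsFix": not (manual_ok and auto_ok),
    }


sample = json.loads("""
{
  "summary": {
    "manualTotal": 5, "manualPassed": 4, "manualFailed": 1, "manualSkipped": 0,
    "autoTotal": 8, "autoPassed": 7, "autoFailed": 1, "autoErrors": 0
  },
  "failures": []
}
""")

report = summarize(sample)
```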

6. Fix Failures

If there are failures, enter plan mode to analyze and fix them.

Test Format

userSteps (what the USER does)

{
  "type": "action | observe | screenshot",
  "instruction": "Human-readable instruction"
}

Step types:

action: User performs an action (navigate, click, type)
observe: User visually verifies something ("Verify success message appears")
screenshot: User captures visual evidence

autoVerifications (what CLAUDE runs)

{
  "description": "Human-readable description",
  "command": "Command Claude executes",
  "expectedPattern": "Regex to match in output",
  "expectedDescription": "What output to expect"
}
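
Before submitting a test, it can help to check it against the two shapes above. The sketch below does that in Python, assuming all four `autoVerifications` keys are required (the document does not say so) and using Python's `re` to check that `expectedPattern` compiles. If the tool matches patterns with a different regex engine, edge cases may differ. The function and constant names are hypothetical.

```python
import re

STEP_TYPES = {"action", "observe", "screenshot"}
VERIFICATION_KEYS = ("description", "command", "expectedPattern", "expectedDescription")


def validate_test(test):
    """Return a list of human-readable problems; an empty list means the test looks well formed."""
    problems = []
    for i, step in enumerate(test.get("userSteps", [])):
        if step.get("type") not in STEP_TYPES:
            problems.append(f"userSteps[{i}]: unknown type {step.get('type')!r}")
        if not step.get("instruction"):
            problems.append(f"userSteps[{i}]: missing instruction")
    for i, check in enumerate(test.get("autoVerifications", [])):
        for key in VERIFICATION_KEYS:
            if not check.get(key):
                problems.append(f"autoVerifications[{i}]: missing {key}")
        try:
            re.compile(check.get("expectedPattern", ""))
        except re.error as exc:
            problems.append(f"autoVerifications[{i}]: invalid expectedPattern ({exc})")
    return problems


good = {
    "userSteps": [{"type": "observe", "instruction": "Verify success message appears"}],
    "autoVerifications": [{
        "description": "Password is properly hashed",
        "command": "psql -c \"SELECT password FROM users WHERE email='test@example.com'\"",
        "expectedPattern": r"\$2[aby]\$",
        "expectedDescription": "Password should be bcrypt hashed",
    }],
}
problems = validate_test(good)
```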

CLI Commands

Command	Description
`start-container [--port N] [--no-open]`	Start Docker webapp
`stop-container`	Stop and remove container
`container-status [--logs]`	Check container health
`generate-tests --feature <desc> --tests <json>`	Create test session
`get-results --session <id> | --latest | --list`	Fetch results
`run-verifications --session <id> --list`	List verifications to run
`run-verifications --session <id> --report <json>`	Report verification results
`config [get|set] [key] [value]`	Manage configuration

Data Storage

Path	Purpose
`~/.e2e-tester/config.json`	Configuration
`~/.e2e-tester/tests/<session>.json`	Test definitions
`~/.e2e-tester/feedback/<session>.json`	User + auto results
`~/.e2e-tester/images/<session>/`	Uploaded screenshots
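
For scripts that need to inspect these files directly, the layout above can be captured in a small helper. This is a sketch that mirrors the table; the function name and the optional `root` override are assumptions added for illustration and testing.

```python
from pathlib import Path


def session_paths(session_id, root=None):
    """Map a session id to the storage locations listed above (hypothetical helper)."""
    base = Path(root) if root is not None else Path.home() / ".e2e-tester"
    return {
        "config": base / "config.json",
        "tests": base / "tests" / f"{session_id}.json",
        "feedback": base / "feedback" / f"{session_id}.json",
        "images": base / "images" / session_id,
    }


paths = session_paths("abc123", root="/tmp/e2e-tester")
```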

Example Session

User: "Add user registration feature"

Claude: [Implements feature, marks todos complete]

Claude: I've implemented user registration. Let me generate E2E tests.

[Generates tests with userSteps (UI actions) and autoVerifications (DB/API checks)]

Claude: Opening the testing webapp. Please:
        1. Go through each manual step (navigate, fill form, click buttons)
        2. Check off what you observe
        3. Upload screenshots of the results
        4. Mark each test Pass/Fail
        5. Submit when done

[User completes manual testing and submits]

User: Done testing

Claude: Thanks! Now running automated verification checks...

[Executes each autoVerification command]

        [PASS] Database: Found user record (id=42, email=test@example.com)
        [PASS] Password: Properly hashed with bcrypt
        [PASS] API: POST /register returns 201
        [PASS] Logs: "User created" event logged
        [FAIL] Email: Welcome email not found in queue

Results:
- Manual: 5/5 passed (you verified all UI flows work)
- Automated: 4/5 passed (email service issue detected)

I found an issue with the email service. Let me investigate...

[Claude enters plan mode to fix the email issue]

Benefits of This Approach

Better UX: Users don't copy-paste commands
More Accurate: Claude runs commands exactly as specified
Full Observability: Claude sees all verification outputs
Clear Separation: Users do human things, Claude does computer things
Faster Testing: Automated checks run in seconds after submit
Better Debugging: Full context of both manual and automated results
