Workflow Debugging

Debug agentic workflow runs using the gh aw audit and gh aw logs commands.

Core Commands

Audit a Specific Run

Investigate a single workflow run with comprehensive error detection:

gh aw audit <run-id-or-url> --parse -v

Accepts:

Numeric run ID: 21005890162
GitHub Actions URL: https://github.com/owner/repo/actions/runs/21005890162
Job URL: https://github.com/owner/repo/actions/runs/21005890162/job/9876543210
Job URL with step: https://github.com/owner/repo/actions/runs/21005890162/job/9876543210#step:7:1
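
When scripting around the audit command, it can help to normalize these input forms first. The following is a minimal sketch, not how gh aw itself parses its argument: RunRef and parse_run_ref are hypothetical names, and the regular expression only covers the URL shapes listed above.

```python
import re
from typing import NamedTuple, Optional


class RunRef(NamedTuple):
    """Parts of a run reference; job_id and step stay None when absent."""
    run_id: str
    job_id: Optional[str] = None
    step: Optional[str] = None


# Covers the URL shapes listed above; the job and "#step:<n>:<line>" parts are optional.
_URL_RE = re.compile(
    r"/actions/runs/(?P<run>\d+)"
    r"(?:/job/(?P<job>\d+))?"
    r"(?:#step:(?P<step>\d+):\d+)?"
)


def parse_run_ref(value: str) -> RunRef:
    """Normalize a numeric run ID or a GitHub Actions URL into its parts."""
    value = value.strip()
    if value.isdigit():
        return RunRef(run_id=value)
    match = _URL_RE.search(value)
    if match is None:
        raise ValueError(f"not a run ID or Actions URL: {value!r}")
    return RunRef(match["run"], match["job"], match["step"])
```

The run_id field gives the <id> used in the output directory name, so a script can locate .github/aw/logs/run-<id>/ regardless of which form the user pasted.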

What it does:

Downloads artifacts and logs to .github/aw/logs/run-<id>/
Detects errors and warnings
Analyzes MCP tool usage statistics
Generates a detailed Markdown report
Extracts the output of a specific step (when given a job URL with a step)

Output location: .github/aw/logs/run-<id>/

Download Multiple Runs

Analyze patterns across multiple workflow executions:

gh aw logs [workflow] --count <N> --parse

Common options:

--count 10 - Download last 10 runs
--start-date -1w - Last week's runs
--end-date -1d - Until yesterday
--engine claude - Filter by engine (claude/codex/copilot)
--firewall - Filter runs with firewall enabled
--safe-output create-issue - Filter by safe output type
--parse - Generate Markdown reports
--json - JSON output format

Output location: .github/aw/logs/ (configurable with -o)
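
For repeated investigations, the options above can be assembled programmatically. This is a rough sketch that only uses flags listed in this section; build_logs_command is a hypothetical helper, and the workflow name in the usage note is a placeholder.

```python
from typing import List, Optional


def build_logs_command(
    workflow: Optional[str] = None,
    count: Optional[int] = None,
    engine: Optional[str] = None,
    start_date: Optional[str] = None,
    parse: bool = True,
) -> List[str]:
    """Build an argv list for `gh aw logs` from the options described above."""
    argv = ["gh", "aw", "logs"]
    if workflow:
        argv.append(workflow)
    if count is not None:
        argv += ["--count", str(count)]
    if start_date:
        argv += ["--start-date", start_date]
    if engine:
        argv += ["--engine", engine]
    if parse:
        argv.append("--parse")
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True); for example, build_logs_command("my-workflow", count=10, engine="claude") corresponds to gh aw logs my-workflow --count 10 --engine claude --parse.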

Debugging Workflow

Step 1: Audit the Run

Start with the audit command to get a comprehensive overview:

gh aw audit <run-url> --parse -v

Review the generated report for:

✅ Success indicators
🟡 Warnings
❌ Errors
Token usage and performance metrics
Job status and duration
Tool usage statistics

Step 2: Examine Logs

Navigate to the downloaded logs:

cd .github/aw/logs/run-<id>/

Key files:

agent-stdio.log - Full agent execution log (search here for errors)
aw_info.json - Workflow metadata and configuration
workflow-logs/ - GitHub Actions job logs
mcp-logs/gateway.md - MCP Gateway status and requests
mcp-logs/mcp-gateway.log - Raw MCP Gateway logs
sandbox/firewall/logs/access.log - Firewall access logs (if enabled)
safe_output.jsonl - Agent's final output (if available)
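
Since some of these files only exist under certain conditions, a quick inventory shows what a given run actually produced. The sketch below assumes the paths are exactly as listed above, relative to the run directory; inventory is a hypothetical helper name.

```python
from pathlib import Path
from typing import Dict

# Paths copied from the key-files list above, relative to the run directory.
KEY_FILES = [
    "agent-stdio.log",
    "aw_info.json",
    "workflow-logs",
    "mcp-logs/gateway.md",
    "mcp-logs/mcp-gateway.log",
    "sandbox/firewall/logs/access.log",
    "safe_output.jsonl",
]


def inventory(run_dir: str) -> Dict[str, bool]:
    """Map each expected path to whether it exists under run_dir."""
    root = Path(run_dir)
    return {name: (root / name).exists() for name in KEY_FILES}
```

A missing safe_output.jsonl or firewall log is not necessarily an error, but it is worth noting before searching files that are not there.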

Step 3: Search for Common Issues

Use the quick scan script for rapid error detection:

python3 scripts/quick_scan.py .github/aw/logs/run-<id>/

Or search manually for specific patterns:

MCP server failures:

grep -E "mcp:.*failed" agent-stdio.log

DNS resolution errors:

grep "dns error.*Name does not resolve" agent-stdio.log

OAuth/authentication issues:

grep "WARN codex_rmcp_client::oauth" agent-stdio.log

Tool availability errors:

grep -i "tool.*not available\|tool.*failed" agent-stdio.log

Firewall blocks:

grep "TCP_DENIED" sandbox/firewall/logs/access.log
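
The agent-stdio.log searches above can also be run in a single pass. The following is an illustrative sketch, not the contents of scripts/quick_scan.py: the regular expressions are direct translations of the grep patterns, while the labels and the scan_lines name are assumptions.

```python
import re
from collections import Counter
from typing import Iterable

# Regex equivalents of the grep patterns above (labels are illustrative).
PATTERNS = {
    "mcp_failure": re.compile(r"mcp:.*failed"),
    "dns_error": re.compile(r"dns error.*Name does not resolve"),
    "oauth_warning": re.compile(r"WARN codex_rmcp_client::oauth"),
    "tool_error": re.compile(r"tool.*not available|tool.*failed", re.IGNORECASE),
}


def scan_lines(lines: Iterable[str]) -> Counter:
    """Count how many lines match each pattern; a line may match several."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

To use it, open agent-stdio.log with errors="replace" and pass the file object to scan_lines. An empty result means none of these signatures appeared, not that the run was healthy.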

Step 4: Check MCP Gateway

Review MCP Gateway logs to verify server connectivity:

cat mcp-logs/gateway.md

Look for:

✓ Successfully loaded servers
🔍 RPC request/response pairs
⚠️ HTTP errors (404, 500, etc.)
✓ Tools list responses

Step 5: Analyze Root Cause

Consult the common errors reference for known patterns:

cat references/common_errors.md

This document catalogs:

MCP server failures (DNS, OAuth, session)
Firewall issues
Agent execution errors
GitHub Actions problems

Step 6: Document Findings

Create an issue to document the problem:

gh issue create \
  --repo <owner/repo> \
  --title "<concise-issue-title>" \
  --body "<detailed-description>"

Include:

Workflow run URL
Summary of the issue
Evidence from logs (error messages)
Root cause analysis
Impact assessment
Reproduction steps
Suggested fixes
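
To keep reports consistent, the checklist above can be turned into a body template. This is a minimal sketch under the assumption that a Markdown body with one heading per item is acceptable; build_issue_body and the placeholder text are hypothetical.

```python
from typing import Mapping

# Section titles taken from the checklist above, in the same order.
SECTIONS = [
    "Workflow run URL",
    "Summary of the issue",
    "Evidence from logs",
    "Root cause analysis",
    "Impact assessment",
    "Reproduction steps",
    "Suggested fixes",
]


def build_issue_body(findings: Mapping[str, str]) -> str:
    """Render findings as a Markdown body, flagging any section left empty."""
    parts = []
    for title in SECTIONS:
        text = findings.get(title, "").strip() or "_Not yet determined._"
        parts.append(f"## {title}\n\n{text}")
    return "\n\n".join(parts) + "\n"
```

The result can be written to a file and passed to gh issue create with --body-file instead of --body, which avoids shell quoting problems with multi-line log excerpts.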

Common Patterns

Silent MCP Failures

Symptom: The workflow shows green, but the agent couldn't use MCP tools

Detection:

gh aw audit <run-id> -v
grep -E "mcp:.*failed" .github/aw/logs/run-<id>/agent-stdio.log

Causes:

DNS resolution failure (host.docker.internal)
OAuth token issues
MCP Gateway not reachable
Session not found errors

Reference: See references/common_errors.md for detailed patterns

False Success

Symptom: Workflow completed successfully but didn't produce expected results

Investigation:

Check for MCP server failures (tools unavailable)
Check for firewall blocks (network requests failed)
Review agent output for errors
Verify safe outputs were created
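
The last check can be partly automated by reading safe_output.jsonl. The sketch below assumes the file is in JSON Lines format (one JSON value per line, as the extension suggests) and that each entry carries a "type" field; the field name is an assumption, so adjust it to whatever the file actually contains.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_safe_outputs(path: str) -> Counter:
    """Count JSON Lines entries by their "type" field (field name is assumed)."""
    counts = Counter()
    file = Path(path)
    if not file.exists():
        return counts  # no file, so there is nothing to count
    for raw in file.read_text(encoding="utf-8").splitlines():
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            counts["<unparseable>"] += 1
            continue
        kind = entry.get("type", "<untyped>") if isinstance(entry, dict) else "<untyped>"
        counts[str(kind)] += 1
    return counts
```

An empty Counter for a run that was expected to create an issue or comment is a strong hint of a false success.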

Network Issues

Detection:

grep "TCP_DENIED\|TAG_NONE" sandbox/firewall/logs/access.log

Causes:

Domain not in firewall allowlist
DNS resolution through proxy failed
Network timeout
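
To see which domains are missing from the allowlist, the denied lines can be grouped by target host. This is a rough sketch: the access log format is not specified here, so it simply assumes each TCP_DENIED line mentions the target as a host name (optionally with a port) and takes the first host-like token. It does not handle TAG_NONE lines, and denied_hosts is a hypothetical name.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: the target appears as a dotted host name ending in letters,
# which skips timestamps and IPv4 addresses.
_HOST_RE = re.compile(r"\b((?:[a-z0-9-]+\.)+[a-z]{2,})(?::\d+)?\b", re.IGNORECASE)


def denied_hosts(lines: Iterable[str]) -> Counter:
    """Count TCP_DENIED lines per host-like token found on the line."""
    counts = Counter()
    for line in lines:
        if "TCP_DENIED" not in line:
            continue
        match = _HOST_RE.search(line)
        counts[match.group(1).lower() if match else "<unknown>"] += 1
    return counts
```

Calling denied_hosts on the open access.log and then most_common() on the result lists the most frequently blocked hosts first, which are the usual candidates for the firewall allowlist.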

Tips

Green doesn't mean success: Always audit the logs even if the workflow shows as successful. Many failures are silent.

Use audit first: The audit command provides a comprehensive overview and is faster than manually downloading and examining logs.

Check all MCP servers: If one MCP server fails, check whether the others failed too; simultaneous failures point to a systemic issue such as DNS or networking.

Firewall logs are crucial: When debugging network issues, always check firewall access logs for blocked domains.

Look for patterns: Use the logs command to download multiple runs and identify patterns across executions.

Reference common errors: Before deep investigation, check references/common_errors.md for known patterns and solutions.

Resources

scripts/quick_scan.py

Rapid error detection script that scans for common issues:

MCP server failures
DNS resolution errors
OAuth/keyring warnings
Tool availability errors
Firewall blocks
MCP Gateway session errors

Usage:

python3 scripts/quick_scan.py <log-directory>

references/common_errors.md

Comprehensive catalog of common error patterns with:

Error signatures and patterns
Search commands for detection
Root cause explanations
Impact assessments
Known symptoms
