aws-troubleshoot

Safety: 95

AWS service troubleshooting patterns. Use for EC2, ECS, Lambda, CloudWatch, RDS issues.

291 stars
5.8k downloads
Updated 2/15/2026

Package Files

SKILL.md

AWS Troubleshooting Expertise

Investigation Methodology

  1. Identify the AWS resource/service involved
  2. Check resource status using describe functions
  3. Review CloudWatch logs for errors
  4. Check CloudWatch metrics for anomalies
  5. Analyze configuration for misconfigurations
  6. Synthesize and recommend

CloudWatch Logs Strategy

Partition First (CRITICAL)

Never dump all logs. Use aggregation queries first:

# Error rate over time
filter @message like /ERROR/
| stats count(*) as errors by bin(5m)

# Top error messages
filter @message like /Exception/
| stats count(*) as occurrences by @message
| sort occurrences desc
| limit 10

# Latency percentiles
stats pct(@duration, 50) as p50, pct(@duration, 99) as p99 by bin(5m)

# Unique error types
filter @message like /ERROR/
| parse @message /(?<error_type>[\w.]+Exception)/
| stats count(*) by error_type
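
The aggregation queries above can be submitted programmatically. The following is a minimal sketch, assuming a client object that behaves like boto3's CloudWatch Logs client (`start_query` and `get_query_results`); the function name `run_insights_query`, the polling interval, and the poll limit are illustrative choices, not part of this skill.

```python
import time


def run_insights_query(logs_client, log_group, query, start_ts, end_ts,
                       poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Run a Logs Insights query and return its rows as plain dicts.

    logs_client is assumed to behave like boto3.client("logs").
    start_ts and end_ts are epoch seconds.
    """
    started = logs_client.start_query(
        logGroupName=log_group,
        startTime=int(start_ts),
        endTime=int(end_ts),
        queryString=query,
    )
    query_id = started["queryId"]

    for _ in range(max_polls):
        result = logs_client.get_query_results(queryId=query_id)
        status = result["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} cells.
            return [
                {cell["field"]: cell["value"] for cell in row}
                for row in result["results"]
            ]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        sleep(poll_seconds)

    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

With boto3 installed, a call might look like `run_insights_query(boto3.client("logs"), "/aws/lambda/my-fn", query, start, end)`, where the log group name is a placeholder. Values come back as strings, so counts need converting before comparison.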

Query Flow

  1. Statistics first: Get error counts, distributions
  2. Identify time windows: Find when errors spiked
  3. Sample from spikes: Get specific examples
  4. Compare to baseline: Query same period yesterday/last week
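
Steps 2 and 4 of this flow can be sketched in a few lines. The example below assumes the error counts per time bin have already been fetched as `(label, count)` pairs; the helper names and the spike rule (more than three times the median, with a floor of five) are illustrative assumptions and should be tuned per service.

```python
from datetime import timedelta
from statistics import median


def baseline_windows(start, end):
    """Return the window itself plus the same window one day and one week earlier."""
    return {
        "current": (start, end),
        "yesterday": (start - timedelta(days=1), end - timedelta(days=1)),
        "last_week": (start - timedelta(weeks=1), end - timedelta(weeks=1)),
    }


def find_spikes(bins, factor=3.0, min_count=5):
    """Return the bins whose count exceeds factor x the median count.

    bins is a list of (label, count) pairs, for example the output of a
    `stats count(*) as errors by bin(5m)` query. min_count keeps a quiet
    log group from flagging every small blip.
    """
    if not bins:
        return []
    typical = median(count for _, count in bins)
    threshold = max(typical * factor, min_count)
    return [(label, count) for label, count in bins if count > threshold]
```

The spike bins give the time ranges to sample raw log lines from, and `baseline_windows` gives the comparison ranges to rerun the same aggregation against.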

Service-Specific Patterns

EC2 Issues

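| Symptom | First Check | Typical Cause |
| --- | --- | --- |
| Unreachable | describe_ec2_instance | Security group, stopped, status check failed |
| Performance | get_cloudwatch_metrics (CPUUtilization) | CPU exhaustion, network saturation |
| Disk full | get_cloudwatch_metrics (DiskSpaceUtilization; a custom metric that EC2 does not publish by default, so it needs the CloudWatch agent or monitoring scripts) | Logs, temp files |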

Key CloudWatch metrics for EC2:

  • CPUUtilization
  • NetworkIn, NetworkOut
  • DiskReadOps, DiskWriteOps
  • StatusCheckFailed
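
As a rough illustration of pulling one of these metrics directly, the sketch below builds the arguments for CloudWatch's `get_metric_statistics` call and picks the worst datapoint from the response. The helper names, the one-hour lookback, and the 5-minute period are assumptions made for the example; the skill's own `get_cloudwatch_metrics` tool may work differently.

```python
from datetime import datetime, timedelta, timezone


def ec2_metric_request(instance_id, metric_name, end=None, hours=1, period=300):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,  # seconds per datapoint
        "Statistics": ["Average", "Maximum"],
    }


def peak_datapoint(response):
    """Return the datapoint with the highest Maximum, or None if there are none.

    Datapoints are not guaranteed to arrive in time order, so this scans all of them.
    """
    points = response.get("Datapoints", [])
    return max(points, key=lambda p: p["Maximum"], default=None)
```

With boto3 installed, usage might look like `peak_datapoint(boto3.client("cloudwatch").get_metric_statistics(**ec2_metric_request(instance_id, "CPUUtilization")))`.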

Lambda Issues

| Symptom | First Check | Typical Cause |
| --- | --- | --- |
| Timeout | CloudWatch logs | External call slow, cold start, insufficient memory |
| Permission denied | CloudWatch logs | IAM role missing permissions |
| Memory error | CloudWatch metrics | Memory allocation too low |
| Cold starts | CloudWatch logs + metrics | Provisioned concurrency needed |

Key CloudWatch metrics for Lambda:

  • Invocations
  • Duration
  • Errors
  • Throttles
  • ConcurrentExecutions

CloudWatch Insights for Lambda:

# Cold start analysis
filter @type = "REPORT"
| stats avg(@initDuration) as avg_cold_start,
        count(@initDuration) as cold_starts,
        count(*) as total_invocations
        by bin(5m)

# Timeout analysis
filter @message like /Task timed out/
| stats count(*) by bin(5m)
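
When sampling individual invocations after the aggregate queries, the REPORT line carries the duration, memory, and init figures in one place. The sketch below parses such a line, assuming the usual tab-separated `Name: value unit` layout; the function names and the 90% memory threshold are illustrative assumptions.

```python
import re

# Matches one "Name: 12.3 ms" or "Name: 128 MB" field of a REPORT line.
_FIELD = re.compile(r"(?:REPORT )?([A-Za-z ]+): ([\d.]+) (ms|MB)")


def parse_report_line(line):
    """Parse a Lambda REPORT log line into a dict of numeric fields.

    Returns None for lines that are not REPORT lines. Keys are the field
    names lowercased with underscores, e.g. "max_memory_used".
    """
    if not line.startswith("REPORT"):
        return None
    fields = {}
    for part in re.split(r"\t+", line.strip()):
        match = _FIELD.fullmatch(part.strip())
        if match:
            key = match.group(1).strip().lower().replace(" ", "_")
            fields[key] = float(match.group(2))
    return fields


def memory_pressure(fields, threshold=0.9):
    """True when peak memory use reached the given fraction of configured memory."""
    return fields["max_memory_used"] / fields["memory_size"] >= threshold
```

A timeout whose REPORT line also shows memory pressure points at the "insufficient memory" cause from the table rather than a slow external call, and the presence of `init_duration` marks the invocation as a cold start.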

ECS/Fargate Issues

| Symptom | First Check | Typical Cause |
| --- | --- | --- |
| Task failed | list_ecs_tasks | Container crash, resource limits, image pull |
| Service unhealthy | list_ecs_tasks | Health check failing, target group issues |
| Slow scaling | CloudWatch metrics | Insufficient capacity, service limits |

Investigation flow:

  1. list_ecs_tasks - See task status and health
  2. Check stopped reason in task description
  3. Review CloudWatch logs for the task
  4. Check container insights metrics
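
Step 2 of this flow can be sketched as a small function over a task description. The example assumes input shaped like the ECS `describe_tasks` response (`tasks[].stoppedReason`, `tasks[].containers[].exitCode` and `reason`); the function name is hypothetical, and the skill's own `list_ecs_tasks` tool may return a different shape.

```python
def summarize_stopped_tasks(describe_tasks_response):
    """Collect the stop reason and container exit details for each stopped task."""
    summaries = []
    for task in describe_tasks_response.get("tasks", []):
        if task.get("lastStatus") != "STOPPED":
            continue
        summaries.append({
            # The task ID is the last path segment of the task ARN.
            "task": task.get("taskArn", "").rsplit("/", 1)[-1],
            "stopped_reason": task.get("stoppedReason"),
            "containers": [
                {
                    "name": container.get("name"),
                    "exit_code": container.get("exitCode"),
                    "reason": container.get("reason"),
                }
                for container in task.get("containers", [])
            ],
        })
    return summaries
```

An exit code of 137 means the process was killed with SIGKILL (128 + 9), which is often, though not always, an out-of-memory kill; a missing exit code with an image-related reason usually means the container never started.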

RDS Issues

| Symptom | First Check | Typical Cause |
| --- | --- | --- |
| Connection refused | get_rds_instance_status | Security group, stopped, maintenance |
| Slow queries | CloudWatch metrics | CPU, IOPS, connections |
| Storage full | CloudWatch metrics | Data growth, logs, snapshots |

Key CloudWatch metrics for RDS:

  • CPUUtilization
  • DatabaseConnections
  • ReadIOPS, WriteIOPS
  • FreeStorageSpace
  • ReadLatency, WriteLatency
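
FreeStorageSpace is reported in bytes, so it is easier to judge against the instance's allocated size. The sketch below assumes the allocated size is known in GiB (for example from the instance description); the helper names and the 10% warning threshold are illustrative assumptions.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def free_storage_percent(free_storage_bytes, allocated_storage_gib):
    """Convert a FreeStorageSpace datapoint (bytes) into percent of allocated storage."""
    return 100.0 * free_storage_bytes / (allocated_storage_gib * GIB)


def storage_is_low(free_storage_bytes, allocated_storage_gib, warn_below=10.0):
    """True when free space has dropped below warn_below percent of the allocation."""
    return free_storage_percent(free_storage_bytes, allocated_storage_gib) < warn_below
```

For example, 5 GiB free on a 100 GiB instance is 5% free, which this rule would flag.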

Common AWS Errors

Permission Errors

AccessDeniedException
UnauthorizedAccess

→ Check IAM role/policy attached to the service

Throttling

Throttling
Rate exceeded
TooManyRequestsException

→ Retry with exponential backoff; if throttling persists, request a limit increase
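
A minimal sketch of exponential backoff with full jitter follows. It assumes exceptions carry a botocore-style `response["Error"]["Code"]` attribute; the function names, the set of codes, and the delay parameters are illustrative choices rather than anything prescribed by this skill.

```python
import random
import time

# Error codes treated as throttling; "ThrottlingException" is added here
# as a common variant of the codes listed above.
THROTTLE_CODES = {"Throttling", "ThrottlingException", "TooManyRequestsException"}


def error_code(exc):
    """Read the AWS error code from a botocore-style exception, if present."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=20.0,
                      sleep=time.sleep, rand=random.uniform):
    """Call fn(), retrying throttling errors with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            throttled = (error_code(exc) in THROTTLE_CODES
                         or "Rate exceeded" in str(exc))
            if not throttled or attempt == max_attempts - 1:
                raise
            # The cap doubles each attempt; sleep a random time up to the cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rand(0, cap))
```

Non-throttling errors are re-raised immediately so that permission or validation failures are not retried. Note that boto3 clients also retry throttling errors on their own, so a wrapper like this mainly helps for calls outside that mechanism.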

Resource Not Found

ResourceNotFoundException
NoSuchEntity

→ Verify resource name, region, account
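
The three error families above can be turned into a simple lookup for triaging log lines. This is only a sketch: the function name and the substring matching are assumptions, and the markers and next steps are taken from the lists in this section.

```python
# (category, markers to look for, suggested next step), in priority order.
ERROR_PATTERNS = [
    ("permission",
     ("AccessDeniedException", "UnauthorizedAccess"),
     "Check IAM role/policy attached to the service"),
    ("throttling",
     ("Throttling", "Rate exceeded", "TooManyRequestsException"),
     "Implement exponential backoff, request limit increase"),
    ("not_found",
     ("ResourceNotFoundException", "NoSuchEntity"),
     "Verify resource name, region, account"),
]


def classify_error(text):
    """Return (category, next_step) for the first matching marker, else None."""
    for category, markers, next_step in ERROR_PATTERNS:
        if any(marker in text for marker in markers):
            return category, next_step
    return None
```

Substring matching keeps the example short; matching on the structured error code is more reliable where one is available.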

Install

Requires askill CLI v1.0+

AI Quality Score

78/100, analyzed 2/19/2026

High-quality AWS troubleshooting reference with detailed CloudWatch queries and service-specific patterns. Well-structured with tables, code examples, and methodology. Lacks a clear "when to use" trigger section and has no hands-on steps, but serves as excellent reference material. Located in a proper skills folder with tags for discoverability.


Metadata

License: unknown
Version: -
Updated: 2/15/2026
Publisher: incidentfox

Tags

observability, security