askill
observability-alert-manager


Configure Grafana alerts for Claude Code anomalies and thresholds. Use when setting up monitoring alerts for sessions, errors, context usage, or subagents.

0 stars
1.2k downloads
Updated 2/5/2026


Observability Alert Manager

Configure and manage Grafana alerts for Claude Code monitoring using enhanced telemetry.

Data Source

Primary: {job="claude_code_enhanced"} in Loki

Operations

create-alert

Define new alert rule. Parameters: name, query (LogQL), threshold, duration, severity, notification.
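
The parameter list above can be sketched as a small validated record. This is a hypothetical model, not the skill's actual implementation: the `AlertRule` class, the `5m` default, and the duration pattern are assumptions, while the severity and notification values are taken from the lists elsewhere in this document.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Values taken from the severity table and notification list in this skill.
VALID_SEVERITIES = ("critical", "warning", "info")
VALID_NOTIFICATIONS = ("slack", "email", "pagerduty", "dashboard")

# Assumption: durations look like Prometheus-style "5m" or "1h".
DURATION_RE = re.compile(r"^\d+(ms|s|m|h|d|w|y)$")


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical container for the create-alert parameters."""

    name: str
    query: str  # LogQL expression
    severity: str
    notification: str
    threshold: Optional[float] = None  # may already be embedded in the query
    duration: str = "5m"  # assumed default, not stated by the skill

    def __post_init__(self) -> None:
        # Reject obviously invalid rules before they reach Grafana.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if self.severity not in VALID_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.notification not in VALID_NOTIFICATIONS:
            raise ValueError(f"unknown notification: {self.notification!r}")
        if not DURATION_RE.match(self.duration):
            raise ValueError(f"invalid duration: {self.duration!r}")
```

Validating up front keeps typos such as an unknown severity from silently producing a rule that never routes anywhere.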

list-alerts

Show all configured alerts and their status.

test-alert

Simulate alert conditions.
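
One way to simulate a condition is to push a synthetic event into Loki so that a template query matches it. The sketch below only builds the request body in the shape used by Loki's push endpoint (`/loki/api/v1/push`). How `test-alert` actually works is not described by this skill, and the `build_push_payload` helper is hypothetical.

```python
import json
import time


def build_push_payload(event_type, fields, now_ns=None):
    """Build a Loki push body containing one synthetic event.

    Loki expects each value as [timestamp in nanoseconds as a string, log line].
    """
    ts = str(now_ns if now_ns is not None else time.time_ns())
    line = json.dumps(fields, sort_keys=True)
    return {
        "streams": [
            {
                # Stream labels mirror the selectors used by the templates.
                "stream": {"job": "claude_code_enhanced", "event_type": event_type},
                "values": [[ts, line]],
            }
        ]
    }


# A session that would trip the "Long Session Duration" template (>3600 s).
payload = build_push_payload("session_end", {"duration_seconds": 4000})
```

Sending `payload` as JSON to the push endpoint of a test Loki instance should make the long-session query return a line; avoid doing this against production data.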

delete-alert

Remove alert rule.

Pre-built Alert Templates

Session Alerts

  1. Long Session Duration: Session >1 hour

    {job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 3600
    
  2. High Turn Count: Session >50 turns

    {job="claude_code_enhanced", event_type="session_end"} | json | turn_count > 50
    
  3. Session Error Spike: >5 errors in session

    {job="claude_code_enhanced", event_type="session_end"} | json | error_count > 5
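
Grafana alert conditions are evaluated on numeric results, whereas the three session templates above are log queries that return matching lines. A minimal sketch of wrapping such a query in a metric aggregation follows; the `to_metric_query` helper, the `5m` window, and the choice of `sum(count_over_time(...))` are assumptions rather than part of the skill.

```python
def to_metric_query(log_query: str, window: str = "5m", threshold: int = 0) -> str:
    """Wrap a LogQL log query so it yields a single number to compare.

    sum(...) collapses the per-stream series that `| json` label extraction
    can create into one value.
    """
    return f"sum(count_over_time({log_query} [{window}])) > {threshold}"


long_session = (
    '{job="claude_code_enhanced", event_type="session_end"} '
    "| json | duration_seconds > 3600"
)
print(to_metric_query(long_session))
```

With the default threshold of 0, the rule fires as soon as one matching session ends inside the window.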
    

Error Alerts

  1. High Error Rate: >5 errors/hour

    count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} [1h]) > 5
    
  2. Specific Tool Failures: Bash errors

    count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error", tool="Bash"} [1h]) > 3
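
To make the threshold semantics concrete, the following sketch mimics `count_over_time(... [1h]) > 5` on a plain list of event timestamps. It is an illustration only: the function names are hypothetical, and the exact handling of window boundaries inside Loki may differ from the half-open interval assumed here.

```python
def count_over_time(timestamps, now, window_seconds):
    """Count events with now - window_seconds < t <= now (assumed boundaries)."""
    start = now - window_seconds
    return sum(1 for t in timestamps if start < t <= now)


def should_fire(timestamps, now, window_seconds=3600, threshold=5):
    """Mirror the strict '>' comparison used by the error-rate templates."""
    return count_over_time(timestamps, now, window_seconds) > threshold


# Six errors in the last hour exceed the threshold of 5; five do not.
six_errors = [100, 200, 300, 400, 500, 600]
print(should_fire(six_errors, now=3600))
print(should_fire(six_errors[:5], now=3600))
```

Because the comparison is strict, exactly five errors in an hour does not trigger the "High Error Rate" template.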
    

Context Alerts

  1. High Context Usage: >80% context window

    {job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 80
    
  2. Auto Compaction Triggered: Context full

    {job="claude_code_enhanced", event_type="context_compact", trigger="auto"}
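
The `| json | context_percentage > 80` stage parses each log line as JSON and keeps those whose field exceeds the threshold. A rough local equivalent, useful for checking sample log lines before writing a rule, is sketched below. The `matching_lines` helper is hypothetical, and it simply skips unparseable lines, whereas Loki reports parse failures through its `__error__` label, which this sketch does not model.

```python
import json


def matching_lines(lines, field, threshold):
    """Return parsed records whose numeric `field` is strictly above `threshold`."""
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # simplification: ignore lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        value = record.get(field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value > threshold:
            matches.append(record)
    return matches


sample = [
    '{"context_percentage": 85, "session": "a"}',
    '{"context_percentage": 40, "session": "b"}',
    "not json",
]
print(matching_lines(sample, "context_percentage", 80))
```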
    

Subagent Alerts

  1. Excessive Subagent Spawning: >10 subagents/session

    {job="claude_code_enhanced", event_type="session_end"} | json | subagents_spawned > 10
    

Activity Alerts

  1. Telemetry Staleness: No data >10min

    absent_over_time({job="claude_code_enhanced"} [10m])
    
  2. Unusual Activity Spike: >100 tool calls/hour

    count_over_time({job="claude_code_enhanced", event_type="tool_call"} [1h]) > 100
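
The staleness template relies on `absent_over_time`, which produces a value of 1 only when the selector has no samples in the window and produces nothing otherwise. The sketch below mirrors that behavior on a list of timestamps; the function name reuses the LogQL name for readability, and the window boundaries are an assumption.

```python
def absent_over_time(timestamps, now, window_seconds):
    """Return 1 if no event falls in (now - window_seconds, now], else None."""
    start = now - window_seconds
    has_data = any(start < t <= now for t in timestamps)
    return None if has_data else 1


# Last event was 700 s ago: stale for a 10-minute (600 s) window.
print(absent_over_time([300], now=1000, window_seconds=600))
# An event 100 s ago keeps the alert silent.
print(absent_over_time([300, 900], now=1000, window_seconds=600))
```

An alert built on this pattern fires on the presence of the value 1, so no extra comparison operator is needed in the query.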
    

Prompt Pattern Alerts

  1. Debugging Session Spike: Many debugging prompts

    count_over_time({job="claude_code_enhanced", event_type="user_prompt", pattern="debugging"} [1h]) > 10
    

Example Alert Configurations

Create High Error Rate Alert

create-alert \
  --name "High Error Rate" \
  --query 'count_over_time({job="claude_code_enhanced", event_type="tool_result", status="error"} [1h]) > 5' \
  --severity warning \
  --notification slack

Create Context Usage Alert

create-alert \
  --name "High Context Usage" \
  --query '{job="claude_code_enhanced", event_type="context_utilization"} | json | context_percentage > 80' \
  --severity info \
  --notification email

Create Session Duration Alert

create-alert \
  --name "Long Session Warning" \
  --query '{job="claude_code_enhanced", event_type="session_end"} | json | duration_seconds > 3600' \
  --severity info \
  --notification dashboard
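
When these commands are generated from a script, quoting the LogQL query correctly is the most error-prone part, because the query contains double quotes, braces, and spaces. A minimal sketch using Python's `shlex` module follows. It assumes `create-alert` accepts exactly the flags shown in the examples above; the `create_alert_command` helper is hypothetical.

```python
import shlex


def create_alert_command(name, query, severity, notification):
    """Return a shell-safe command line for the create-alert examples."""
    argv = [
        "create-alert",
        "--name", name,
        "--query", query,
        "--severity", severity,
        "--notification", notification,
    ]
    # shlex.join quotes each argument so the shell passes it through unchanged.
    return shlex.join(argv)


query = (
    'count_over_time({job="claude_code_enhanced", '
    'event_type="tool_result", status="error"} [1h]) > 5'
)
print(create_alert_command("High Error Rate", query, "warning", "slack"))
```

Building the argument list first and quoting once at the end avoids hand-escaping the query string.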

Grafana Alert Setup

Via Grafana UI

  1. Navigate to Alerting → Alert rules
  2. Create new rule with Loki data source
  3. Enter LogQL query from templates above
  4. Configure conditions and notifications

Via API

curl -X POST http://localhost:3000/api/ruler/grafana/api/v1/rules/claude-code \
  -H "Content-Type: application/json" \
  -u admin:admin \
  -d '{
    "name": "claude-code-alerts",
    "rules": [
      {
        "alert": "HighErrorRate",
        "expr": "count_over_time({job=\"claude_code_enhanced\", event_type=\"tool_result\", status=\"error\"} [1h]) > 5",
        "for": "5m",
        "labels": {"severity": "warning"},
        "annotations": {"summary": "High error rate detected"}
      }
    ]
  }'
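
The same request can be assembled programmatically so that the embedded LogQL does not need manual backslash escaping. The sketch below only constructs the request and does not send it. The URL and body shape are copied from the curl example above and the expression follows the High Error Rate template; the schema a given Grafana version accepts for Grafana-managed rules may differ, so treat the payload as an assumption to verify against your instance. The `build_rule_group_request` helper is hypothetical.

```python
import base64
import json
import urllib.request


def build_rule_group_request(base_url, namespace, user, password):
    """Build (but do not send) the POST request from the curl example."""
    body = {
        "name": "claude-code-alerts",
        "rules": [
            {
                "alert": "HighErrorRate",
                "expr": (
                    'count_over_time({job="claude_code_enhanced", '
                    'event_type="tool_result", status="error"} [1h]) > 5'
                ),
                "for": "5m",
                "labels": {"severity": "warning"},
                "annotations": {"summary": "High error rate detected"},
            }
        ],
    }
    # json.dumps handles the quoting that curl needs \" for.
    data = json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/ruler/grafana/api/v1/rules/{namespace}",
        data=data,
        method="POST",
    )
    req.add_header("Content-Type", "application/json")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req


req = build_rule_group_request("http://localhost:3000", "claude-code", "admin", "admin")
# urllib.request.urlopen(req) would send it; replace the default credentials first.
```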

Notification Channels

  • Slack: Webhook integration
  • Email: SMTP configuration
  • PagerDuty: Incident management
  • Dashboard: On-screen annotations

Alert Severity Levels

Level    | Use Case
critical | Immediate action required
warning  | Needs attention soon
info     | Informational, no action needed

Scripts

  • scripts/create-alert.sh - Create new alert
  • scripts/list-alerts.sh - List all alerts
  • scripts/test-alerts.sh - Test alert conditions
  • scripts/import-alert-templates.sh - Import all pre-built templates

Install

Requires askill CLI v1.0+

AI Quality Score

95/100 (analyzed 2/12/2026)

A comprehensive and well-structured skill for configuring Grafana alerts based on Claude Code telemetry. It provides actionable LogQL queries, CLI command examples, and API setup instructions, making it highly practical for observability setups.


Metadata

License: unknown
Version: -
Updated: 2/5/2026
Publisher: majiayu000

Tags

api, llm, observability, prompting, testing