ssmd-health-run

Procedures for running ssmd health checks and interpreting results. Covers quick triage, data pipeline health, infrastructure checks (NATS, Postgres, Redis), connector/archiver deep dives via port-forward, and Cloud Monitoring queries. Use when checking environment health, investigating degradation, or verifying health after deployments.

Updated 2/20/2026

SKILL.md

ssmd-health-run

Quick Triage

# Non-Running pods (excludes Completed CronJobs)
kubectl get pods -n ssmd --no-headers | grep -v -E 'Running|Completed'

# Not-ready pods
kubectl get pods -n ssmd --no-headers | grep Running | grep -v -E '1/1|2/2'

# Recent error/warning events
kubectl get events -n ssmd --sort-by='.lastTimestamp' --field-selector type!=Normal | tail -20
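
The not-ready filter above only recognizes pods with one or two containers. As a minimal sketch of a more general check, the helper below compares the two halves of the READY column instead. The function name `not_ready_pods` is hypothetical and not part of ssmd; it reads `kubectl get pods --no-headers` output on stdin.

```shell
# Hypothetical helper: print Running pods whose READY count (column 2, e.g. "1/2")
# is below the container total. Reads `kubectl get pods --no-headers` output on stdin.
not_ready_pods() {
  awk '$3 == "Running" { split($2, r, "/"); if (r[1] != r[2]) print $1, $2 }'
}

# Usage (assumes kubectl access to the ssmd namespace):
#   kubectl get pods -n ssmd --no-headers | not_ready_pods
```

Because it compares ready and total counts rather than matching fixed strings, this should keep working if a sidecar is added to a pod.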

Deployments

Actual deployment names in the ssmd namespace:

| Deployment | Component |
|---|---|
| kalshi-crypto-connector | Kalshi connector |
| kraken-futures-connector | Kraken connector |
| polymarket-connector | Polymarket connector |
| kalshi-crypto-archiver | Kalshi archiver |
| kraken-futures-archiver | Kraken archiver |
| polymarket-archiver | Polymarket archiver |
| ssmd-operator | CRD operator |
| ssmd-data-ts | API server (port 8080) |
| ssmd-cdc | CDC pipeline |
| ssmd-redis | Redis |

StatefulSets: ssmd-postgres-0

Connector / Archiver Health

The Rust containers ship without wget or curl, so reach their endpoints through kubectl port-forward:

# Connector health (each returns JSON with status, feed, connected, last_message_secs_ago).
# Commands are joined with ';' and kill targets $! so the port-forward is always
# stopped, even when curl fails or other background jobs exist.
kubectl port-forward -n ssmd deploy/kalshi-crypto-connector 8080:8080 &
sleep 2; curl -s http://localhost:8080/health; kill $!

kubectl port-forward -n ssmd deploy/kraken-futures-connector 8081:8080 &
sleep 2; curl -s http://localhost:8081/health; kill $!

kubectl port-forward -n ssmd deploy/polymarket-connector 8082:8080 &
sleep 2; curl -s http://localhost:8082/health; kill $!

Archiver health uses the same pattern with archiver deployment names.
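
Since the same port-forward, curl, and kill sequence repeats for every connector and archiver, it can be wrapped in a small function. The sketch below is one way to do that; `pf_get` and the `PF_WAIT` variable are hypothetical names, and it assumes every target deployment serves on container port 8080 as the connectors above do.

```shell
# Hypothetical helper: port-forward a deployment, fetch one path, always clean up.
# Usage: pf_get <deployment> <local-port> <path>
# Assumes the deployment listens on container port 8080 in the ssmd namespace.
pf_get() {
  local deploy=$1 port=$2 path=$3 pid rc=0
  kubectl port-forward -n ssmd "deploy/${deploy}" "${port}:8080" >/dev/null 2>&1 &
  pid=$!
  sleep "${PF_WAIT:-2}"                      # give the tunnel time to come up
  curl -s --max-time 5 "http://localhost:${port}${path}" || rc=$?
  kill "$pid" 2>/dev/null || true            # runs even if curl failed
  wait "$pid" 2>/dev/null || true            # reap the port-forward process
  return "$rc"
}

# Usage:
#   pf_get kalshi-crypto-archiver 8084 /health
#   pf_get kalshi-crypto-connector 8080 /metrics | grep -E 'websocket_connected|idle_seconds'
```

Tracking the port-forward by PID rather than by job number avoids killing an unrelated background job in the same shell.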

Prometheus metrics snapshot (also via port-forward):

kubectl port-forward -n ssmd deploy/kalshi-crypto-connector 8080:8080 &
sleep 2; curl -s http://localhost:8080/metrics | grep -E 'websocket_connected|idle_seconds|messages_total'; kill $!

Infrastructure

NATS

NATS runs as a StatefulSet (nats-0). Use the nats-box pod for CLI commands:

# JetStream health
kubectl exec -n nats deploy/nats-box -- nats server check jetstream

# List streams (shows message counts, last message time)
kubectl exec -n nats deploy/nats-box -- nats stream ls

# Stream detail
kubectl exec -n nats deploy/nats-box -- nats stream info PROD_KALSHI_CRYPTO

# Consumer list for a stream
kubectl exec -n nats deploy/nats-box -- nats consumer ls PROD_KALSHI_CRYPTO

Streams: PROD_KALSHI_CRYPTO, PROD_KRAKEN_FUTURES, PROD_POLYMARKET, PROD_KALSHI_LIFECYCLE, SECMASTER_CDC, SIGNALS
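
To inspect every stream in one pass, the per-stream command above can be looped over the stream list. This is a minimal sketch: the `nats_stream_report` function and the `STREAMS` variable are hypothetical names, and the list is copied from the line above, so it needs updating by hand if streams change.

```shell
# Stream names as listed in this document; update by hand if streams are added.
STREAMS="PROD_KALSHI_CRYPTO PROD_KRAKEN_FUTURES PROD_POLYMARKET PROD_KALSHI_LIFECYCLE SECMASTER_CDC SIGNALS"

# Hypothetical helper: print `nats stream info` for each known stream.
nats_stream_report() {
  local s
  for s in $STREAMS; do
    echo "== $s"
    kubectl exec -n nats deploy/nats-box -- nats stream info "$s"
  done
}

# Usage (assumes kubectl access to the nats namespace):
#   nats_stream_report | less
```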

data-ts (Postgres + API)

data-ts listens on port 8080 (not 3000). Health endpoint is /health (not /v1/health).

From allowed CIDRs (home network) — LoadBalancer direct access, no port-forward needed:

curl -s http://<LB-IP>:8080/health
# End-to-end API probe (requires datasets:read API key):
curl -s -H "Authorization: Bearer <API_KEY>" "http://<LB-IP>:8080/v1/markets/lookup?ids=KXBTCD-26FEB0317-T76999.99&feed=kalshi"

From elsewhere — via port-forward:

kubectl port-forward -n ssmd deploy/ssmd-data-ts 8083:8080 &
sleep 2; curl -s http://localhost:8083/health; kill $!

Returns {"status":"ok"} when Postgres is connected.

Redis

kubectl exec -n ssmd deploy/ssmd-redis -- redis-cli ping

Operator

kubectl get deploy ssmd-operator -n ssmd
kubectl logs -n ssmd deploy/ssmd-operator --tail=50

Data Pipeline Health (CLI)

# Composite health report (writes to DB)
ssmd health daily

Cloud Monitoring Queries

gcloud monitoring time-series list \
  --project=massive-acrobat-227416 \
  --filter='metric.type="prometheus.googleapis.com/ssmd_connector_websocket_connected/gauge"' \
  --interval-start-time=$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)

Replace the metric type in the filter to query other metrics: ssmd_connector_messages_total/counter, ssmd_archiver_messages_total/counter.
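
The `date -u -v-1H` form in the query above is BSD/macOS syntax and fails with GNU date on Linux. A small helper can try the GNU form first and fall back to the BSD form; `one_hour_ago` is a hypothetical name used only for this sketch.

```shell
# Hypothetical helper: UTC timestamp for one hour ago, in the format the
# query above passes to --interval-start-time.
# Tries GNU date (-d) first, then falls back to BSD/macOS date (-v).
one_hour_ago() {
  date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -v-1H +%Y-%m-%dT%H:%M:%SZ
}

# Usage: substitute the helper for the inline date call in the query above:
#   --interval-start-time="$(one_hour_ago)"
```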

DQ Checks

See ssmd-dq-run skill. Summary:

uv run data/dq.py --date YYYY-MM-DD --feed kalshi --stream crypto
uv run data/dq.py --date YYYY-MM-DD --feed kraken-futures --stream futures --prefix kraken-futures
uv run data/dq.py --date YYYY-MM-DD --feed polymarket --stream markets --prefix polymarket

The ssmd-dq-daily CronJob runs at 03:30 UTC. To trigger a manual run, replacing MMDD with the month and day:

kubectl create job --from=cronjob/ssmd-dq-daily ssmd-dq-manual-MMDD -n ssmd
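
The MMDD suffix in the manual job name can be generated rather than typed. The sketch below assumes the suffix should follow the current UTC date, which the source does not state; `dq_manual_job_name` is a hypothetical helper name.

```shell
# Hypothetical helper: build the manual job name from today's date.
# Assumption: MMDD means the current UTC month and day.
dq_manual_job_name() {
  echo "ssmd-dq-manual-$(date -u +%m%d)"
}

# Usage (assumes kubectl access to the ssmd namespace):
#   kubectl create job --from=cronjob/ssmd-dq-daily "$(dq_manual_job_name)" -n ssmd
```

Job names must be unique within the namespace, so a second manual run on the same day needs a different suffix.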

Interpreting Results

| Dimension | GREEN | YELLOW | RED |
|---|---|---|---|
| Pods | All Running, ready | Restarts > 0 | CrashLoopBackOff |
| Connectors | connected, idle < 60s | idle 60-300s | disconnected or idle > 300s |
| Archivers | Running, GCS sync recent | sync 6-12h ago | sync > 12h |
| NATS | JetStream OK, msgs flowing | consumer lag > 1000 | JetStream unhealthy |
| Postgres | data-ts /health OK | slow queries | connection refused |
| Redis | PONG | - | error / timeout |
| DQ Score | >= 98 | 85-97 | < 85 |
| Composite | >= 85 | 60-84 | < 60 |
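
The numeric thresholds above can be applied mechanically. The sketch below covers the Connectors, DQ Score, and Composite rows; both function names are hypothetical. It assumes the connector's `connected` field is passed as the literal string `true` or `false`, and that idle seconds and scores are whole numbers.

```shell
# Classify a connector from the /health fields `connected` and `last_message_secs_ago`.
# Assumes `connected` is "true"/"false" and idle is a whole number of seconds.
classify_connector() {
  local connected=$1 idle=$2
  if [ "$connected" != "true" ] || [ "$idle" -gt 300 ]; then
    echo RED
  elif [ "$idle" -ge 60 ]; then
    echo YELLOW
  else
    echo GREEN
  fi
}

# Classify a whole-number score given the GREEN and YELLOW lower bounds.
classify_score() {
  local score=$1 green_min=$2 yellow_min=$3
  if [ "$score" -ge "$green_min" ]; then
    echo GREEN
  elif [ "$score" -ge "$yellow_min" ]; then
    echo YELLOW
  else
    echo RED
  fi
}

# Usage:
#   classify_connector true 120    # connector idle for 120s
#   classify_score 91 98 85        # DQ Score bounds
#   classify_score 91 85 60        # Composite bounds
```

The same score lands in different bands depending on the row: 91 is YELLOW as a DQ Score but GREEN as a Composite.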


AI Quality Score

78/100 (analyzed 2/24/2026)

Comprehensive health check procedure for an internal SSMD/Kubernetes environment. Provides actionable kubectl commands, port-forward patterns, and health endpoints for NATS, Postgres, Redis, and various connectors. Tightly coupled to internal infrastructure (specific project ID, namespace, deployment names), which limits reusability but makes it highly actionable for its target environment. Well-structured with clear sections and interpretation tables.


Metadata

License: unknown
Version: -
Updated: 2/20/2026
Publisher: aaronwald

Tags

api, ci-cd, database, observability