Await Polygraph CI
Wait for all CI pipelines in a Polygraph session to reach a stable state (succeeded, failed, etc.), then produce a unified summary. If any pipelines failed, investigate via child agents and present fix options.
Phase 1: Session Discovery
- Get the current branch name: !git branch --show-current
- Use the branch name as the session ID. If on main, master, or dev, ask the user for an explicit session ID.
- Fetch session: cloud_polygraph_get_session(sessionId: <session-id>)
- Record monitorStartedAt = current timestamp (epoch millis).
- Build a tracking table of all repos with PRs. For each PR, record:
  - repo: repository name
  - prUrl: PR URL
  - prStatus: DRAFT / OPEN / MERGED / CLOSED
  - ciStatus: from session (may already be a terminal status from a previous run)
  - cipeUrl: CI pipeline URL (null if none)
  - cipeCompletedAt: completedAt from session (epoch millis, null if CIPE is active or absent)
  - selfHealingStatus: self-healing fix status (null if none)
  - firstSeenAt: current timestamp
- If no PRs found, report "No PRs in session" and exit.
- Stale detection: For each PR, determine whether its CI status is stale, meaning it reflects a previous run rather than a current one. A PR's CI status is stale if:
  - cipeCompletedAt is non-null AND cipeCompletedAt < monitorStartedAt (the CIPE finished before the monitor started)
  - Mark these PRs as stale: true
- Display the initial status table, annotating stale PRs:
backend: SUCCEEDED (stale) | frontend: SUCCEEDED (stale) | shared-lib: NOT_STARTED
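The tracking record and the stale check above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `TrackedPr` class, the snake_case field names, and the helper functions are hypothetical names introduced here, and the session response shape is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrackedPr:
    """One row of the tracking table (hypothetical shape, fields from the list above)."""
    repo: str
    pr_url: str
    pr_status: str                     # DRAFT / OPEN / MERGED / CLOSED
    ci_status: str
    cipe_url: Optional[str]
    cipe_completed_at: Optional[int]   # epoch millis; None if the CIPE is active or absent
    self_healing_status: Optional[str]
    first_seen_at: int                 # epoch millis
    stale: bool = False


def is_stale(cipe_completed_at: Optional[int], monitor_started_at: int) -> bool:
    """A status is stale when its CIPE finished before the monitor started."""
    return cipe_completed_at is not None and cipe_completed_at < monitor_started_at


def format_status_line(prs: List[TrackedPr]) -> str:
    """Render the one-line status table, annotating stale PRs."""
    parts = []
    for pr in prs:
        suffix = " (stale)" if pr.stale else ""
        parts.append(f"{pr.repo}: {pr.ci_status}{suffix}")
    return " | ".join(parts)
```

A caller would set `pr.stale = is_stale(pr.cipe_completed_at, monitor_started_at)` for each PR right after building the table, then print `format_status_line(prs)`.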
Phase 2: Polling Loop
Configuration:
- Timeout: 30 minutes total
- Backoff: 60s → 90s → 120s (cap)
- Circuit breaker: exit after 5 consecutive polls with no status change
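One way to express this configuration is sketched below. The names are hypothetical, and two details are assumptions because the text does not settle them: the backoff never resets after a status change, and the first poll only establishes a baseline, so the breaker trips on the fifth consecutive poll that matches its predecessor.

```python
from typing import Any, Iterator

TIMEOUT_SECONDS = 30 * 60          # 30 minutes total
BACKOFF_SECONDS = (60, 90, 120)    # last value is the cap
MAX_UNCHANGED_POLLS = 5


def backoff_delays() -> Iterator[int]:
    """Yield 60, 90, then 120 indefinitely."""
    for delay in BACKOFF_SECONDS:
        yield delay
    while True:
        yield BACKOFF_SECONDS[-1]


class CircuitBreaker:
    """Trips after `limit` consecutive polls whose snapshot equals the previous one."""

    def __init__(self, limit: int = MAX_UNCHANGED_POLLS) -> None:
        self.limit = limit
        self.unchanged = 0
        self.last: Any = None

    def record(self, snapshot: Any) -> bool:
        """Record a poll result; return True when the breaker has tripped."""
        if self.last is not None and snapshot == self.last:
            self.unchanged += 1
        else:
            self.unchanged = 0   # assumption: any change resets the counter
        self.last = snapshot
        return self.unchanged >= self.limit
```

Here `snapshot` can be any comparable value, for example a dict mapping each repo to its `(ciStatus, selfHealingStatus)` pair.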
Each poll iteration:
- Call cloud_polygraph_get_session(sessionId: <session-id>)
- Update each tracked PR from the session response: ciStatus, cipeUrl, cipeCompletedAt, and selfHealingStatus
- Clear stale flag: If a PR was marked stale: true and its cipeCompletedAt has changed (or become null, meaning a new CIPE is active), clear the stale flag, because this PR now has fresh CI data.
- Display status update:
  [await-polygraph-ci] Poll #N | Elapsed: Xm | Repos: Y total, Z completed
  backend: SUCCEEDED | frontend: FAILED (self-healing: PENDING) | shared-lib: SUCCEEDED (stale)
  Include selfHealingStatus inline when non-null. Annotate stale PRs.
- Check exclusion rule: if a PR has prStatus: DRAFT and ciStatus: NOT_STARTED for more than 5 minutes since firstSeenAt, mark it as EXCLUDED (DRAFT PRs may not trigger CI)
- Check terminal conditions. A PR is terminal when:
  - It is NOT stale, AND:
    - CI status is SUCCEEDED, CANCELED, or TIMED_OUT, OR
    - CI status is FAILED AND there is no active self-healing (i.e., selfHealingStatus is null or a final state like APPLIED, REJECTED, FAILED)
  - A FAILED PR with selfHealingStatus indicating an in-progress fix (e.g., PENDING, IN_PROGRESS) is NOT terminal. Keep polling to track the self-healing outcome
  - A stale PR is NOT terminal. Keep polling until it gets a fresh CIPE or is excluded
- Stale timeout: If a stale PR remains stale for more than 5 minutes, assume no new CI is expected for it. Clear the stale flag and treat its current status as final.
- If all non-excluded PRs are terminal → proceed to Phase 3
- If timeout or circuit breaker hit → proceed to Phase 3 with partial results
- Otherwise → wait with backoff, then poll again
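The per-PR checks in the iteration above reduce to a few small predicates, sketched here. All function names and the `stale_since` timestamp are hypothetical. One assumption is labeled in the code: any self-healing status other than PENDING or IN_PROGRESS is treated as final, since the text lists final states only by example.

```python
from typing import Optional

FIVE_MINUTES_MS = 5 * 60 * 1000
ALWAYS_TERMINAL = {"SUCCEEDED", "CANCELED", "TIMED_OUT"}
SELF_HEALING_IN_PROGRESS = {"PENDING", "IN_PROGRESS"}


def is_excluded(pr_status: str, ci_status: str, first_seen_at: int, now: int) -> bool:
    """DRAFT PRs that never start CI are excluded after 5 minutes."""
    return (
        pr_status == "DRAFT"
        and ci_status == "NOT_STARTED"
        and now - first_seen_at > FIVE_MINUTES_MS
    )


def should_clear_stale(previous_completed_at: Optional[int],
                       current_completed_at: Optional[int],
                       stale_since: int, now: int) -> bool:
    """Clear the flag on fresh CIPE data, or once the 5-minute stale timeout passes."""
    changed = current_completed_at != previous_completed_at  # includes becoming None
    timed_out = now - stale_since > FIVE_MINUTES_MS
    return changed or timed_out


def is_terminal(ci_status: str, self_healing_status: Optional[str], stale: bool) -> bool:
    """Apply the terminal conditions to a single PR."""
    if stale:
        return False
    if ci_status in ALWAYS_TERMINAL:
        return True
    if ci_status == "FAILED":
        # Assumption: anything not explicitly in progress counts as a final state.
        return self_healing_status not in SELF_HEALING_IN_PROGRESS
    return False
```

The loop proceeds to Phase 3 once `is_terminal` holds for every PR that `is_excluded` does not filter out.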
Phase 3: Results Analysis
Categorize repos into: succeeded, failed, canceled, timed_out, excluded, in_progress (if timed out).
Display final summary table. When showing self-healing status, distinguish clearly between these states:
- COMPLETED = a fix was generated and verified, but NOT yet applied. Display as fix available.
- APPLIED = the fix was applied by the user or agent. Display as fix applied, awaiting re-run.
- IN_PROGRESS / PENDING = the fix is still being generated. Display as in progress.
- REJECTED = the fix was rejected. Display as fix rejected.
- FAILED = self-healing failed to produce a fix. Display as fix failed.
[await-polygraph-ci] Final Results | Elapsed: Xm
SUCCEEDED: backend, shared-lib
FAILED: frontend (self-healing: fix available)
EXCLUDED: docs (DRAFT, no CI)
Include self-healing status for any repo that has one.
- If all succeeded → report success and exit
- If any failed with selfHealingStatus: APPLIED, inform the user that the fix was applied and a CI re-run may be in progress or needed
- If any failed with selfHealingStatus: COMPLETED, inform the user that a fix is available but not yet applied, and offer to apply it
- If any failed → proceed to Phase 4
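The display labels for self-healing states and the final summary layout can be sketched as a lookup table plus a small formatter. The function names and the `(category, repo, status)` tuple shape are hypothetical. Extra annotations such as "(DRAFT, no CI)" for excluded repos are left out for brevity.

```python
from typing import Dict, List, Optional, Tuple

SELF_HEALING_LABELS = {
    "COMPLETED": "fix available",
    "APPLIED": "fix applied, awaiting re-run",
    "IN_PROGRESS": "in progress",
    "PENDING": "in progress",
    "REJECTED": "fix rejected",
    "FAILED": "fix failed",
}


def describe(repo: str, self_healing_status: Optional[str] = None) -> str:
    """Render a repo name, appending the self-healing label when one exists."""
    if self_healing_status is None:
        return repo
    # Unknown statuses fall back to the raw value rather than being hidden.
    label = SELF_HEALING_LABELS.get(self_healing_status, self_healing_status)
    return f"{repo} (self-healing: {label})"


def final_summary(elapsed_minutes: int,
                  results: List[Tuple[str, str, Optional[str]]]) -> str:
    """Group (category, repo, self_healing_status) rows into the summary text."""
    grouped: Dict[str, List[str]] = {}
    for category, repo, status in results:
        grouped.setdefault(category, []).append(describe(repo, status))
    lines = [f"[await-polygraph-ci] Final Results | Elapsed: {elapsed_minutes}m"]
    for category, repos in grouped.items():
        lines.append(f"{category}: {', '.join(repos)}")
    return "\n".join(lines)
```

Categories appear in the order they are first seen in `results`, so passing succeeded repos first reproduces the layout of the example summary.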
Phase 4: Failure Investigation (Child Agent Delegation)
For each repo with ciStatus: FAILED:
- Display known info from the session data before delegating:
  Repository: frontend
  CI Pipeline: <cipeUrl from session>
  Self-healing: <selfHealingStatus from session, or "None">
  Investigating failure details...
- Delegate investigation (non-blocking): call cloud_polygraph_delegate for each failed repo:
  - sessionId: the session ID
  - target: the repository name
  - instruction: Use the ci_information MCP tool to investigate the CI failure on this branch. Return a structured summary with: (1) list of failed task IDs with a one-line error summary each, (2) failure category (Build / Test / Lint / E2E / Infra / Other).
  - context: Polygraph session monitoring: investigating CI failure for unified summary.
  Since cloud_polygraph_delegate is non-blocking, you can delegate to multiple failed repos in parallel.
- Monitor investigation progress: poll cloud_polygraph_child_status to wait for each child agent to complete:
  cloud_polygraph_child_status(sessionId: "<session-id>", target: "frontend")
  Poll until the child agent's status indicates completion. Use the tail parameter to retrieve recent output lines containing the investigation results.
- Collect each child agent's response from the status output. If a child agent fails or gets stuck, use cloud_polygraph_stop_child to terminate it and skip that repo.
- Display failure summary for each repo:
  Repository: frontend
  CI Pipeline: <cipeUrl>
  Failed Tasks (2):
  - frontend:build → TypeScript error in src/app.tsx:42
  - frontend:test → 3 test suites failed
  Category: Build + Test failures
  Self-healing: <selfHealingStatus>
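As a rough sketch of the delegation step, the arguments for each cloud_polygraph_delegate call can be assembled up front, one per failed repo. Only the four parameter names come from the steps above. The helper name and the dict-based PR records are hypothetical, and how the calls are actually issued depends on the MCP client, which is not shown.

```python
from typing import Dict, List

INSTRUCTION = (
    "Use the ci_information MCP tool to investigate the CI failure on this branch. "
    "Return a structured summary with: (1) list of failed task IDs with a one-line "
    "error summary each, (2) failure category "
    "(Build / Test / Lint / E2E / Infra / Other)."
)
CONTEXT = "Polygraph session monitoring: investigating CI failure for unified summary."


def build_delegate_requests(session_id: str, prs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return one argument dict per repo whose ciStatus is FAILED."""
    return [
        {
            "sessionId": session_id,
            "target": pr["repo"],
            "instruction": INSTRUCTION,
            "context": CONTEXT,
        }
        for pr in prs
        if pr["ciStatus"] == "FAILED"
    ]
```

Because delegation is non-blocking, all requests in the returned list can be sent before any child status is polled.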
Phase 5: Fix Planning
- Group failures by category (Build, Test, Lint, E2E, Infra)
- Identify cross-repo dependency issues (e.g., shared-lib build failure blocking frontend)
- Suggest fix order based on dependency graph (upstream repos first)
- Present next actions to the user based on self-healing status:
  - If any repo has selfHealingStatus with an available fix → offer to apply self-healing via update_self_healing_fix(action: "APPLY") or reject it
  - If self-healing was already applied → offer to resume monitoring to watch the re-triggered CI
  - Delegate fixes: use Polygraph to send fix instructions to child agents (for repos without self-healing or where self-healing was rejected/failed)
  - Get more details: drill into a specific repo's failure
  - Exit: done monitoring
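The "upstream repos first" ordering is a topological sort of the failed repos. The sketch below uses Python's standard `graphlib` module (3.9+). The function name and the `depends_on` mapping are hypothetical, since the source does not say where the dependency graph comes from. Only direct dependencies between failed repos are considered.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set


def suggest_fix_order(failed_repos: Iterable[str],
                      depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order failed repos so that each repo comes after the failed repos it depends on.

    `depends_on` maps a repo to the set of upstream repos it depends on.
    """
    failed = set(failed_repos)
    graph = {
        repo: {dep for dep in depends_on.get(repo, set()) if dep in failed}
        for repo in sorted(failed)
    }
    # static_order() yields predecessors (upstream repos) before their dependents
    # and raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())
```

For the example in the list above, a failing shared-lib that frontend depends on would be placed before frontend, so fixing it first may clear the downstream failure without extra work.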
Notes
- This skill does NOT push code directly. The only write action it may take is applying/rejecting a self-healing fix via update_self_healing_fix, which is an Nx Cloud operation (not a local code change).
- Both ci_information and update_self_healing_fix responses include a hints array with contextual guidance (e.g., disclaimers about which CI Attempt was retrieved). Always check and surface non-empty hints.
- All heavy CI data inspection happens in child agents via cloud_polygraph_delegate to keep this context window clean.
- cloud_polygraph_delegate is non-blocking: it starts the child agent and returns immediately. Use cloud_polygraph_child_status to poll for results and cloud_polygraph_stop_child to terminate stuck agents.
- The cloud_polygraph_get_session response is compact and safe to poll from the main agent.
