SONNET CODE
← Back to all articles
AI ToolsJune 30, 2026·7 min read

xAI Ships Grok Build /goal Autonomous Mode and Agent Dashboard

What shipped on June 24 and how it closes the four-vendor coding-agent race

On June 24, 2026 xAI shipped /goal inside Grok Build — a long-running autonomous mode that plans the work, executes against the plan, verifies its own output, and exposes status / pause / resume / clear controls for steering the run while it operates. The same release shipped the Agent Dashboard, a single-screen surface for managing many parallel Grok Build sessions at once, plus the grok-build-0.1 model on the xAI API in early access.

The operationally important shape is not the feature list. It is that the four-vendor coding-agent routing matrix — Cursor 3, Claude Code, OpenAI Codex, xAI Grok Build — has now converged on the same load-bearing capability profile in under six weeks:

  • Cursor 3 ships eight parallel agents inside isolated git worktrees with the Agent Client Protocol underneath the dispatch surface.
  • Claude Code runs dynamic-workflow subagent fan-out with the per-run ceiling lifted to 1,000 against Opus 4.8.
  • OpenAI Codex runs on Ona / Gitpod persistent cloud sandboxes that keep the agent working for hours or days after the developer closes the laptop.
  • Grok Build ships /goal long-running autonomous mode plus the Agent Dashboard for managing parallel sessions, all riding the Agent Client Protocol for cross-vendor portability.

The convergence reads three ways for the FY27 buyer:

  • The capability floor has lifted. A single coding-agent vendor no longer wins the routing-matrix slot on a single capability — every vendor at the frontier-tier coding-agent slot now ships parallel-session orchestration, long-running autonomous execution, an isolated execution substrate, and a dashboard surface for steering many runs at once. The per-vendor capability delta is now the quality of the four primitives, not the presence of any one of them.
  • The Agent Client Protocol is the load-bearing portability artifact. Three of the four vendors now ship Agent Client Protocol support (Cursor 3, Claude Code via the open spec, Grok Build at GA); the protocol is the substrate the buyer's standing contract leans on for the dual-vendor portability commitment the FY27 plan has to encode. The buyer that does not require ACP support inside the standing contract is the buyer that locks the per-team agent dispatch workflow to a single vendor's UI for the standing-contract term.
  • The procurement question is no longer 'which IDE feels best.' It is which two of the four anchor the routing matrix, what per-quarter portability commitment underwrites the dual-vendor shape, and what dispatch-and-review competence the senior-engineering function operates against all four primitives in parallel.

Three shifts in the AI-integrated product team's procurement decision

Three concrete shifts that follow when the four-vendor coding-agent matrix converges on the same load-bearing capability profile in the same quarter.

The routing-matrix decision becomes a two-vendor decision, not a one-vendor decision. The FY27 standing contract that anchors a single coding-agent vendor is the contract that under-prices the per-quarter capability-slip risk against the converged capability floor. The standing-vendor map for the coding-agent slot reads more honestly as a two-vendor anchor (a primary vendor the team standardizes the day-to-day routing matrix against, a second-source vendor the team grades against a per-quarter portability eval) plus an explicit ACP-conformant dispatch surface the team can swap one vendor out of without rewriting the production agent workflow. The team that writes the two-vendor shape into the standing contract this quarter is the team that owns the per-quarter capability-slip insurance the converged floor exposes.

The dispatch-and-review competence becomes the team's load-bearing engineering discipline. Eight parallel agents in Cursor, 1,000 subagents under Claude Code, persistent cloud sandboxes under Codex, /goal long-running runs under Grok Build — the per-vendor capability surface is now uniformly the team's ability to dispatch many agent runs in parallel and review the outputs at a cadence that keeps the production code surface healthy. The competence is not vendor-specific; it is the team-specific discipline of writing per-task verification contracts, maintaining a per-task review queue at the cadence the agent throughput demands, naming the orchestrator-coordinator role on the team, and grading the dispatch-and-review rate as a first-class engineering metric. The team that grows the competence inside the senior-engineering function is the team that converts the converged capability floor into measurable production-velocity translation; the team that does not is the team that runs four vendors in parallel without throughput translation on any one of them.

The Agent Dashboard surface becomes the standing engineering-rhythm surface. Grok Build's Agent Dashboard, Cursor 3's parallel-agent view, Claude Code's subagent run dashboard, and Codex's persistent-sandbox surface are all the same artifact: the single-screen surface where the senior-engineering function reviews the per-team agent-run portfolio at the standing engineering rhythm cadence. The team that operates against the dashboard surface inside the standing engineering review buys itself the per-team agent-run portfolio visibility the converged capability floor demands; the team that runs the agent dispatch outside the standing engineering rhythm is the team that re-discovers the per-team-overnight-review meeting two quarters into the rollout.

Where this lands in the AI-integrated product team's next sprint

The team that has been running a single coding-agent vendor against the routing matrix has three concrete pieces of work that drop into the sprint backlog this week.

Add a second-source vendor to the routing matrix against an ACP-conformant dispatch substrate. Pick the second vendor in the four-vendor matrix the team grades against the standing portability eval; require Agent Client Protocol support inside the standing contract; instrument the per-team agent dispatch workflow so a quarter's notice on a per-vendor capability slip lets the team route the dispatch surface to the second-source vendor without rewriting the production agent workflow. The artifact is the two-vendor anchor inside the standing contract, written portably enough to survive a per-vendor capability slip.

Name the orchestrator-coordinator role inside the senior-engineering function. The dispatch-and-review competence is a senior-engineering function with a named owner, a per-task verification-contract artifact, a per-team agent-run review queue, and a per-team-overnight-review meeting on the standing engineering rhythm. The team that names the role this quarter buys itself the operating model that converts the converged capability floor into measurable production-velocity translation; the team that does not is the team that re-discovers the meeting two quarters later under the standing post-mortem on the per-feature velocity regression.

Run the Agent Dashboard surface inside the standing engineering review. Move the per-team agent-run portfolio review from an ad-hoc cadence to the standing engineering rhythm — the Friday engineering review, the standing weekly review, whatever the team's standing rhythm is. The agent-run portfolio surfaces against the same review cadence the per-feature velocity surfaces against, and the senior-engineering function grades both against the same per-team engineering-capacity envelope. The team that operates against the standing review cadence is the team that catches the per-feature velocity regression the converged capability floor silently exposed.

The senior judgment the four-vendor convergence makes visible

The Grok Build /goal release is not the event; it is the marker that the four-vendor coding-agent routing matrix has converged on the same load-bearing capability profile inside a six-week window. The substrate is the per-team dispatch-and-review competence the senior-engineering function operates against all four primitives in parallel, and the standing-contract conversation is the per-quarter portability commitment against the Agent Client Protocol dispatch surface the FY27 plan has to encode.

The procurement question is no longer which coding-agent vendor's IDE feels best to the team; it is which two of the four converged-capability vendors anchor the routing matrix, what per-quarter portability commitment underwrites the dual-vendor shape against the standing contract, what dispatch-and-review competence the senior-engineering function operates against the per-team agent-run portfolio, what per-team-overnight-review cadence the standing engineering rhythm absorbs, and where the orchestrator-coordinator role lands inside the senior-engineering headcount plan the FY27 budget underwrites. The teams that ask the right question this quarter buy themselves the production-velocity translation the converged capability floor exposes; the teams that ask the wrong one buy themselves the per-feature velocity regression the converged floor silently underwrote.


At SONNET CODE we operate the dispatch-and-review competence for every AI-integration engagement we ship — the two-vendor routing matrix against the ACP dispatch surface, the per-team agent-run portfolio review inside the standing engineering rhythm, the orchestrator-coordinator role inside the senior-engineering function. If your team is re-grading the FY27 coding-agent routing matrix against the four-vendor convergence, schedule a call — we'll walk you through the per-team dispatch-and-review competence we run against the production code surface.