Sonnet Code
AI Development · June 19, 2026 · 10 min read

Salesforce Brought Agentforce Multi-Agent Orchestration to GA on June 15 — Atlas Reasoning Engine 3.0 Routes Work by Agent Description Instead of Decision Trees, the Agent2Agent Protocol and MCP Ship in the Same Release, Agentforce Sits at $800M ARR Up 169% YoY With 29,000 Deals Closed and 2.4 Billion Agentic Work Units Logged — The Enterprise CRM Default Just Acquired the Orchestrator-and-Specialists Topology, and Every Team Embedded in a Salesforce-Centric Stack Has a Q3 Integration Decision the Procurement Spreadsheet Doesn't Yet Reflect.

What Salesforce shipped on June 15 and the topology that lands with it

On June 15, 2026, Salesforce brought Multi-Agent Orchestration in Agentforce to general availability — the centerpiece of the Summer '26 release and the moment the enterprise CRM default formally moved from single chatbots wired into the workflow to coordinated teams of specialist agents routed by an orchestrator. The release rolled out in waves from June 13 and is production-eligible under API v67.0. Multi-Agent Orchestration had been in beta through the Spring '26 cycle; the June 15 date is what makes the topology supported for production workloads instead of preview-only experimentation.

The operationally important pieces:

  • The orchestrator-and-specialists pattern is now the default shape, not the edge case. A single orchestrator agent receives the inbound request, inspects which specialist subagents are registered, reads each specialist's description and available actions, and routes the work to whichever specialist is best suited. The pattern is the same shape every serious multi-agent framework has been converging on for eighteen months; what changed on June 15 is that the CRM the enterprise's customer-facing automation already lives inside ships it out of the box, with the orchestration metadata managed in the same admin surface as the rest of the org configuration.
  • Atlas Reasoning Engine 3.0 routes by agent description, not by fixed decision trees. The Atlas refresh is the load-bearing piece. The routing engine reads each specialist agent's description — written in natural language by the team that built the agent — and uses the description as the routing signal. The implication for the team that owns the agent inventory: the description is now production code, with the same review discipline, the same change-control surface, and the same regression-risk profile as any other component the orchestrator depends on. The team that treats agent descriptions as a documentation field ships a routing surface that drifts; the team that treats them as the routing API ships a topology that holds up under audit.
  • The Agent2Agent protocol and MCP ship in the same release. A2A is the cross-agent communication surface that lets agents from different platforms collaborate; MCP is the tool-and-context surface that lets agents reach into external systems. Bundling both in the same release is the directional commitment: Salesforce is not pretending the production AI architecture will be Salesforce-only. The buyer's read should be the same — the agents inside Agentforce will talk to agents and tools outside Agentforce, and the integration discipline lives on the buyer's side of the contract.
  • The commercial signal underneath is large and accelerating. Agentforce ARR reached $800 million, up 169% year over year. Combined AI revenue across the Salesforce surface surpassed $2.9 billion. The company closed 29,000 Agentforce deals in the last year, up 50% quarter-on-quarter in Q4 alone, and logged 2.4 billion agentic work units across Agentforce and Slack. The numbers are not a forecast about whether the enterprise will adopt agents; the enterprise has adopted them, and the install base the next 12 months will refactor against is already in production.
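The description-based routing in the list above can be pictured with a toy scorer. This is a minimal sketch, not how Atlas Reasoning Engine 3.0 works internally: the word-overlap score is a stand-in for whatever the engine actually computes, and the specialist names, descriptions, and the `route` function are hypothetical.

```python
import re

# Hypothetical specialist registry: names and descriptions are illustrative only.
SPECIALISTS = {
    "cancel_order": "Cancel an existing customer order and issue a refund.",
    "upgrade_recommendation": "Recommend plan upgrades based on account usage.",
    "case_deflection": "Answer common support questions from the knowledge base.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def route(request: str, specialists: dict[str, str]) -> tuple[str, float]:
    """Pick the specialist whose description shares the most words with the request.

    The score is the fraction of request tokens that also appear in the
    description. Ties keep the first specialist encountered.
    """
    request_tokens = tokenize(request)
    best_name, best_score = "", -1.0
    for name, description in specialists.items():
        overlap = request_tokens & tokenize(description)
        score = len(overlap) / len(request_tokens) if request_tokens else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


if __name__ == "__main__":
    print(route("Please cancel my order", SPECIALISTS))
```

Even in this toy version, changing "Cancel" to "Cancels" in the first description lowers its score for the same request, which is the sense in which the description behaves like production code.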

The structural read isn't that Salesforce added a multi-agent feature. It's that the default shape of enterprise customer-facing AI automation (the shape the procurement spreadsheet, the implementation playbook, and the integration architecture were all written against for the prior generation) just moved from a single-purpose chatbot inside a CRM workflow to an orchestrator plus specialists wired into the CRM's runtime. The teams whose architecture diagrams still show the single-agent topology are working from a diagram the platform has already moved past.

What the orchestrator default restructures about enterprise AI architecture

Four concrete shifts that follow when the CRM ships the orchestrator-and-specialists pattern as the default.

The agent inventory becomes a first-class engineering surface. Twelve months ago, the Agentforce agent was a single noun the admin team configured. Today, the org has an orchestrator and a portfolio of specialists, each with its own description, its own action surface, its own failure-mode tail, and its own routing-grade signal. The team that catalogs the inventory the same way it catalogs production microservices — versioned descriptions, change-history per agent, per-agent eval gold sets, per-agent senior-review queue — gets an inventory that survives the next platform release; the team that lets the inventory accumulate organically gets a routing surface no one understands six months later. The catalog discipline is the engineering work the orchestrator default forces on the buyer.

The orchestrator's description-based routing makes the agent's description the new attack surface. A natural-language description is a much richer routing signal than a hard-coded decision tree, and it is also much more sensitive to wording, framing, and inadvertent overlap between specialists. A team whose two specialists have descriptions that both plausibly cover the "cancel customer order" path ships an orchestrator that routes the work to whichever specialist's description happens to score higher on the day's prompt: a non-deterministic failure mode the team will see in production before it sees it in QA. The procurement read: the team needs a description-review discipline, a per-specialist coverage matrix, and a routing-decision dashboard that grades the orchestrator's choices against the team's own gold set. The platform does not ship those; the buyer has to build them.
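A description-review pass could start with something as simple as flagging pairs of specialists whose descriptions overlap heavily. The sketch below uses Jaccard similarity over word sets as a crude proxy; the threshold, the function names, and the sample descriptions are all assumptions for illustration, and a real review would likely need a semantic comparison rather than word matching.

```python
import re
from itertools import combinations


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_pairs(
    descriptions: dict[str, str], threshold: float = 0.5
) -> list[tuple[str, str, float]]:
    """Return specialist pairs whose description similarity meets the threshold."""
    flagged = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(
        sorted(descriptions.items()), 2
    ):
        score = jaccard(tokenize(desc_a), tokenize(desc_b))
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


# Hypothetical descriptions: two of them plausibly cover the same cancel path.
DESCRIPTIONS = {
    "order_cancellation": "Cancel a customer order",
    "order_management": "Cancel or change a customer order",
    "case_deflection": "Answer support questions from the knowledge base",
}

if __name__ == "__main__":
    for pair in overlapping_pairs(DESCRIPTIONS):
        print(pair)
```

Running a check like this in the same review step that approves a description change gives the team a chance to catch the overlap before the orchestrator does.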

The multi-vendor integration story moves to the foreground. The bundled A2A and MCP commitments say the platform will let the Agentforce orchestrator talk to non-Salesforce agents and reach into non-Salesforce tools. The buyer's read: the production AI architecture is not Salesforce on the customer-facing edge, vendor-X on the engineering edge, vendor-Y on the back-office edge, all isolated; the production architecture is cross-vendor agent traffic, cross-vendor tool calls, cross-vendor context flowing through the orchestrator. The team that treats A2A and MCP as ship-day features the integration team will turn on later is the team that discovers the cross-vendor audit trail, the cross-vendor failure-mode tail, and the cross-vendor data-handling questions in production. The team that designs the integration architecture against the multi-vendor reality the platform now formally supports gets the topology that compounds.

The implementation playbook needs the senior-review queue per specialist, not per platform. A single Agentforce chatbot had a single senior-review surface — the QA cohort that graded the chatbot's outputs before promotion. An orchestrator with a portfolio of specialists has a senior-review surface per specialist, and the per-specialist failure modes are different — the cancel-order specialist fails differently than the upgrade-recommendation specialist, which fails differently than the case-deflection specialist. The team that runs a single rubric across every specialist catches the failure modes of the median agent and misses the failure modes of the long tail; the team that runs per-specialist rubrics, calibrated per-specialist quarterly, catches the long tail before the long tail catches the customer.

Where the GA is signal and where it is noise

Four honest reads on what the June 15 milestone actually tells the buyer.

Signal: the multi-agent topology is now the enterprise default. The combination of Salesforce shipping it as the Summer '26 centerpiece, the $800M ARR, the 29,000 deals closed, and the 2.4 billion agentic work units logged is consistent with what every adjacent platform — ServiceNow, IBM watsonx Orchestrate, Microsoft Agent 365 — has been signaling for two quarters. The orchestrator-and-specialists pattern is the working pattern of the enterprise install base. The framing of multi-agent as a leading-edge experiment is no longer the working framing.

Signal: the bundled A2A and MCP commitments are the right architectural posture. Bundling them with Multi-Agent Orchestration at GA — rather than positioning them as optional add-ons gated behind a higher tier — says the platform's product team has internalized that the production AI architecture is multi-vendor by design. The buyer who reads the bundle as a directional commitment that future Agentforce releases will continue to invest in cross-vendor interoperability is reading the right signal.

Noise: the GA milestone does not eliminate the implementation work. Multi-Agent Orchestration shipping in the platform is not the same as Multi-Agent Orchestration shipping in the customer's org. The agent-inventory catalog, the per-specialist eval gold sets, the description-review discipline, the routing-decision dashboard, and the per-specialist senior-review queue are all the buyer's work. The platform shipped the topology; the platform did not ship the discipline the topology requires.

Noise: the headline ARR number is not the customer-side adoption signal. $800M ARR up 169% YoY is the platform vendor's revenue, not the buyer's deployed-agent count. The deployment-side signal comes from questions the team has to answer internally: which Agentforce specialists are in production today, how many escalations each handles weekly, what the human-handoff rate is per specialist, and what the orchestrator's routing-accuracy rate is. The platform's headline number does not substitute for the team's own dashboard.

What the team should do inside the first 90 days

Four concrete actions that close the gap between the GA milestone and the production discipline the topology requires.

Stand up the agent-inventory catalog as a versioned engineering artifact. Each Agentforce specialist gets a row: the description (in the version the orchestrator sees), the action surface, the change-history, the owner, the eval gold-set link, and the senior-review-queue load. The catalog lives in source control or in a wiki the engineering team treats as source-of-truth. The discipline is identical to the discipline the team gives any microservice catalog; the platform-team-side benefit shows up the first time a routing-decision regression has to be traced back to a description change.
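As a rough illustration, a catalog row could be modeled as a small record that tracks its own description changes. This is a sketch under assumptions: the `CatalogEntry` class, its field names, and the hash-based change log are hypothetical and do not correspond to any Agentforce metadata type.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One row of a hypothetical agent-inventory catalog."""

    name: str
    description: str          # the text the orchestrator routes on
    actions: list[str]        # the specialist's action surface
    owner: str
    gold_set_path: str        # link to the per-specialist eval gold set
    review_queue_load: int = 0
    history: list[str] = field(default_factory=list)

    def description_hash(self) -> str:
        """Short fingerprint of the current description, for change tracking."""
        digest = hashlib.sha256(self.description.encode("utf-8")).hexdigest()
        return digest[:12]

    def update_description(self, new_description: str, note: str) -> None:
        """Replace the description and append an old-to-new hash entry to history."""
        old_hash = self.description_hash()
        self.description = new_description
        self.history.append(f"{old_hash} -> {self.description_hash()}: {note}")


if __name__ == "__main__":
    entry = CatalogEntry(
        name="cancel_order",
        description="Cancel a customer order.",
        actions=["cancel_order"],
        owner="support-eng",
        gold_set_path="evals/cancel_order.jsonl",
    )
    entry.update_description(
        "Cancel an existing customer order and issue a refund.",
        "clarified refund scope",
    )
    print(entry.history)
```

Keeping the hash pair in the history makes it possible to line up a routing regression with the exact description revision that preceded it.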

Author the per-specialist eval gold sets against the team's own customer-traffic distribution. A specialist's success rate against an aggregate benchmark is the wrong signal; the right signal is the specialist's success rate against the specific intent distribution the team's customers actually generate. The work is the gold-set authoring per specialist, the per-orchestrator-routing-decision grading against those gold sets, and the rubric refresh quarterly so the eval surface tracks the customer-intent drift.
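Grading routing decisions against a gold set can be sketched as a per-specialist accuracy count. Everything here is illustrative: the gold-set entries, the `stub_router` function standing in for the orchestrator, and the `routing_accuracy` helper are assumptions, not platform APIs.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical gold set: (customer utterance, specialist that should handle it).
GOLD_SET = [
    ("cancel my order", "cancel_order"),
    ("please cancel it", "cancel_order"),
    ("how do I reset my password", "case_deflection"),
    ("upgrade my plan", "upgrade_recommendation"),
]


def stub_router(utterance: str) -> str:
    """Stand-in for the orchestrator: routes on a single keyword."""
    return "cancel_order" if "cancel" in utterance else "case_deflection"


def routing_accuracy(
    gold: list[tuple[str, str]], route_fn: Callable[[str], str]
) -> dict[str, float]:
    """Fraction of gold-set items routed to the expected specialist, per specialist."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for utterance, expected in gold:
        totals[expected] += 1
        if route_fn(utterance) == expected:
            correct[expected] += 1
    return {name: correct[name] / totals[name] for name in totals}


if __name__ == "__main__":
    print(routing_accuracy(GOLD_SET, stub_router))
```

Reporting accuracy per expected specialist, rather than as one aggregate number, is what exposes a specialist the router never reaches, such as `upgrade_recommendation` in this stub.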

Wire the routing-decision dashboard into the engineering review cadence. The dashboard surfaces, per week: which specialists the orchestrator routed to, what the routing-confidence distribution looks like, which intent classes the orchestrator over-routed or under-routed, which specialists hit their human-handoff threshold, and which descriptions might need refresh based on the routing-accuracy signal. The cadence keeps the discipline alive after the GA-week excitement wears off.
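The weekly rollup behind such a dashboard might look like the sketch below. It assumes the team already has routing events with a specialist name, a confidence value, and a handoff flag; the `RoutingEvent` shape, the field names, and the 20% handoff threshold are hypothetical, and whether the platform exposes a routing-confidence value in this form is not something the release details here confirm.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RoutingEvent:
    """One routing decision from a hypothetical event log."""

    specialist: str
    confidence: float
    handed_off: bool  # True if the conversation escalated to a human


def weekly_summary(
    events: list[RoutingEvent], handoff_threshold: float = 0.2
) -> dict[str, dict[str, object]]:
    """Aggregate one week of routing events into per-specialist metrics."""
    routed = Counter(e.specialist for e in events)
    handoffs = Counter(e.specialist for e in events if e.handed_off)
    summary: dict[str, dict[str, object]] = {}
    for name, count in routed.items():
        confidences = [e.confidence for e in events if e.specialist == name]
        rate = handoffs[name] / count
        summary[name] = {
            "routed": count,
            "mean_confidence": round(sum(confidences) / len(confidences), 2),
            "handoff_rate": round(rate, 2),
            "over_handoff_threshold": rate > handoff_threshold,
        }
    return summary


if __name__ == "__main__":
    week = [
        RoutingEvent("cancel_order", 0.9, False),
        RoutingEvent("cancel_order", 0.7, True),
        RoutingEvent("case_deflection", 0.8, False),
    ]
    for name, metrics in weekly_summary(week).items():
        print(name, metrics)
```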

Calibrate the per-specialist senior-review queue against the multi-agent failure-mode tail. Each specialist's confident-and-wrong failure mode is different; the senior-review queue calibrated to catch the cancel-order specialist's failure modes is calibrated against a different failure shape than the queue catching the upgrade-recommendation specialist's. The rubric authoring per specialist is the senior-judgment workload the topology imposes on the team; the calibration cadence keeps the queue from drifting against the agent's behavior.

What this does not change

Three honest caveats.

It does not eliminate the multi-vendor architectural reality. Agentforce shipping the orchestrator pattern inside the CRM does not collapse the buyer's other-platform agent decisions — the engineering-side orchestration in Microsoft Agent 365, the IT-operations agent in ServiceNow, the data-side agent in Snowflake Cortex, the back-office agent in SAP Joule. The bundled A2A and MCP commitments make the cross-platform integration plausible; they do not make it free. The integration architecture is still the buyer's engineering work.

It does not eliminate the description-attack surface. A natural-language description is a routing signal that depends on the wording, the framing, and the implicit overlap between specialists. The team that does not run a description-review discipline ships a routing surface with non-deterministic failure modes the platform's vendor cannot diagnose for the team. The discipline is the team's work.

It does not eliminate the senior-judgment workload behind every specialist. Each specialist's failure-mode tail — the confident wrong answers the orchestrator routed correctly but the specialist mis-handled — is the cost the team pays for the routing decision. The senior-review queue calibrated per specialist is the engineering and human-judgment workload the orchestrator topology imposes on the buyer.

Where Sonnet Code fits

Multi-Agent Orchestration in Agentforce is the right enterprise default for the customer-facing AI surface. The platform-team-side benefit shows up the first time the team has to defend the topology against a routing-accuracy regression, a description-attack incident, or a cross-vendor integration failure mode — and the work that closes those gaps is engineering and senior-judgment work that does not live in the platform.

AI development at Sonnet Code is the engineering half: standing up the agent-inventory catalog as a versioned engineering artifact against the team's existing Agentforce, Slack, and external-tool surfaces; designing the description-review discipline and the per-specialist coverage matrix that keeps the orchestrator's routing surface deterministic enough to operate; integrating the routing-decision dashboard into the engineering review cadence with per-specialist routing-accuracy, human-handoff, and failure-mode-tail attribution; and wiring the cross-vendor A2A and MCP surfaces against the team's Microsoft, ServiceNow, Snowflake, and SAP integration points so the production AI architecture matches the multi-vendor reality the platform now formally supports.

AI training at Sonnet Code is the human-judgment half: senior engineers and domain experts who author the per-specialist gold sets that grade each Agentforce specialist honestly against the team's specific customer-intent distribution; design the per-specialist senior-judgment rubrics that calibrate the senior-review queue for the multi-agent failure-mode tail; refresh the gold sets and rubrics quarterly so the routing decisions and the specialist behaviors do not silently drift as the intent distribution evolves; and serve as the senior-judge pool whose calibrated decisions feed the description-refresh and rubric-refresh updates the next routing-accuracy cycle resolves against.

The enterprise CRM default just acquired the multi-agent topology. The teams that walk into Q3 with the agent-inventory catalog stood up, the per-specialist eval gold sets authored against the team's own customer-intent distribution, the routing-decision dashboard wired into the engineering review cadence, and the per-specialist senior-review queue calibrated against the multi-agent failure-mode tail are the teams that turn the GA milestone into a compounding customer-experience advantage. The teams that treat the GA as a feature-flag-flip and stop there will discover the description-attack surface, the routing-accuracy drift, and the cross-vendor integration debt — six months after the buyer down the road figured out how to grade the topology honestly.