Sonnet Code
The Sonnet Code Blog · Page 7

Engineering notes from the field.

Essays and field notes on AI, software engineering, design, and the craft of building product teams that ship. Written by the engineers who do the work. Posts are in English.

AI Development · 10 min read

GitHub Copilot Shipped the Desktop App, Copilot SDK GA, and Sandboxes on June 2 — The Coding-Assistant-in-the-IDE Era Ended and the Agent-Native Developer Platform Era Began. Voice, Canvases, Cloud Sessions, Agentic Browsing, and Programmatic Access to the Agent Runtime All Landed Inside 24 Hours. The Engineering Org's Tooling Stack Just Inherited a New Top Row.

On June 2, 2026, GitHub shipped three changes that together end the coding-assistant-in-the-IDE era for the Copilot install base: the GitHub Copilot Desktop App went into expanded technical preview for every Copilot Pro, Pro+, Business, and Enterprise customer, with canvases, voice conversations, cloud sessions, agentic browsing, and tighter Copilot CLI integration as first-class surfaces; the Copilot SDK went generally available with a stable API, production support, and programmatic access to the same agent runtime that powers the GitHub Copilot product itself, including planning, tool invocation, file edits, streaming, and multi-turn sessions; and Copilot Sandboxes went live — secure, isolated local-and-cloud environments where Copilot can execute tools, modify files, and interact with the network under policy the customer defines. The deprecation of GPT-4.1 across all Copilot experiences landed the same day. The structural read isn't 'GitHub shipped another feature batch.' It's that the largest distribution surface for AI-assisted coding in the world — 1.8M+ paid seats and counting — just stopped framing itself as a coding assistant inside the IDE and started framing itself as an agent-native developer platform with the IDE as one of several entry surfaces. Here's what that does to the engineering-org tooling stack, the embedded-agent build-vs-buy decision for every product team that was considering rolling their own agent runtime, and the eval-and-governance discipline a regulated buyer needs to wire up before the platform's agentic surface becomes production reality at fleet scale.

Sonnet Code Editorial Team · June 7, 2026
AI Development · 10 min read

Snowflake CoCo Went GA at Summit 26 on June 2 — Coding Agent + Datastream Land Inside the Customer's Data Perimeter, 7,100 Enterprise Customers Already Building, IDE + SDK + VS Code + Claude Code Plugins as Day-One Surfaces. The Boundary Between 'Data Platform' and 'Agentic Runtime' Just Stopped Existing for the Snowflake Install Base.

At Snowflake Summit 26 in San Francisco on June 2, Snowflake announced that Snowflake CoCo — formerly Cortex Code — has graduated into the company's agentic control plane for enterprise AI, with a native desktop IDE, an agent SDK, workflow extensions for VS Code and Claude Code, and a forthcoming mobile app and Slackbot. Snowflake Datastream, a fully managed Apache Kafka service, was unveiled alongside it as the streaming substrate that lets agents act on real-time enterprise data without leaving the governance plane. More than 7,100 Snowflake customers — Fanatics, Thomson Reuters, WHOOP among the named early adopters — are already building on CoCo. The structural read isn't 'another coding agent.' Every cloud vendor has shipped one in the last six months. It's that the largest pure-play data platform in the enterprise market just collapsed the boundary between 'the warehouse / lakehouse where the business's data lives' and 'the agentic runtime that operates on that data', with the data-governance posture, the access-control surface, and the audit trail unified by default rather than wired together by an integration team three quarters after the fact. The AI specialist firm that was being priced to build the 'connect the agent to the data, set up the auth surface, instrument the audit pipeline' integration just lost the easy half of that engagement to platform plumbing. Here's what that does to the make-vs-configure conversation for Snowflake's install base, the routing-portfolio shape for the BYO-model layer underneath CoCo, and the eval-and-governance discipline a regulated buyer needs to wire on top before the platform's promise becomes production reality.

Sonnet Code Editorial Team · June 7, 2026
AI Development · 10 min read

OutSystems Unveiled the Agentic Systems Platform at ONE 2026 in Amsterdam — Low-Code's First Native Multi-Agent Runtime, With Claude Code, Codex, and Kiro as First-Class Coding Tools and the Enterprise Context Graph as the Governance Plane. The Boundary Between "Custom AI Engineering" and "Standard Configuration Work" Just Moved Inside the Low-Code Box.

At the OutSystems ONE Conference in Amsterdam in early June 2026, OutSystems announced the Agentic Systems Platform, powered by the OutSystems Enterprise Context Graph — a pivot from low-code application development to a full multi-agent platform with Agentic Systems Engineering, Agentic Enterprise Orchestration, and Agentic Industry Solutions as the three core domains. The platform supports Claude Code, Codex, and Kiro as first-class agentic coding tools, lets enterprises build, orchestrate, and govern agents on a single open unified platform, and ships with the Enterprise Context Graph as the governance, identity, and observability plane underneath. Early access opens in Q2 2026; agentic coding, publishing, and platform extensibility went live at ONE. The structural read isn't 'a low-code vendor pivoted to agents.' Every enterprise software vendor has shipped some variant of that announcement. It's that the largest low-code install base in the regulated-enterprise segment — a buyer profile that has historically held the line on governance, identity, audit, and lifecycle management — just made the agentic-AI stack a standard component of the low-code platform layer, with Claude Code and Codex sitting inside it as configured tools rather than externally-integrated specialist surfaces. The work that an AI specialist firm would have priced as a custom build for a regulated enterprise just moved one layer down into the platform the customer was already paying for. Here's what that does to the make-vs-configure conversation, the partner-versus-specialist procurement shape, and the eval-and-governance discipline a regulated buyer needs to wire up before the platform's promises become production reality.

Sonnet Code Editorial Team · June 6, 2026
AI Training · 10 min read

Microsoft Frontier Tuning Shipped at Build 2026 — Reinforcement Learning Inside the Customer's Compliance Boundary, Trained on the Trace of Real Work, With One Internal HR Deployment Going From 13% to 87% Task Completion. The Enterprise Fine-Tuning Conversation Just Stopped Being About Static Datasets and Started Being About Workflows.

At Microsoft Build 2026 on June 2, Microsoft introduced Frontier Tuning — a new post-training and continuous-improvement system that applies reinforcement learning inside the customer's compliance boundary, using the trace of actual work (tool calls, decisions, corrections, outcomes) as the training signal rather than a static dataset of labeled examples. The system runs in a managed Reinforcement Learning Environment used both for post-training and inference, learns from real workflows without disrupting production systems, and produced an 87% successful task completion rate on an internal Microsoft HR deployment that started at 13%. Private preview is available through Forward Deployed Engineers, with broader availability coming in Microsoft Copilot Studio and Microsoft Foundry. The structural read isn't 'another fine-tuning offering.' Supervised fine-tuning on static datasets is what every cloud vendor has been selling since 2023, and the enterprises that have actually deployed it know how brittle the results are when the underlying workflow drifts. It's that Microsoft has packaged the reinforcement-learning loop — the same loop that produces the capability gains at the frontier labs — as a managed service that operates on the customer's own workflows, behind the customer's compliance boundary, with the customer's eval signals as the reward function. The post-training discipline that has lived only inside the frontier labs and a handful of well-funded enterprise ML teams just became a procurement line item. Here's what that does to the human-in-the-loop training data conversation, the eval-rubric discipline, and the shape of the senior-review queue when the model is learning continuously from the workflow rather than being shipped quarterly.

Sonnet Code Editorial Team · June 6, 2026
AI Development · 10 min read

Microsoft Shipped MAI-Thinking-1 at Build 2026 on June 2 — Its First In-House Reasoning Model, Trained From Scratch on Commercially Licensed Data With Zero Distillation From OpenAI. 35B Active / ~1T Total MoE, 256K Context, 97% on AIME 2025, Matches Claude Opus 4.6 on SWE-Bench Pro. The Foundation-Model Independence Conversation Just Stopped Being Aspirational.

At Microsoft Build 2026 on June 2, Microsoft unveiled MAI-Thinking-1 — the company's first frontier-tier reasoning model, built from scratch on clean, commercially licensed data with no distillation from OpenAI's GPT family or any other third-party model. The architecture is a sparse Mixture-of-Experts with roughly 35 billion active parameters and a total parameter pool near one trillion, a 256,000-token context window, and a posture explicitly tuned for multi-step reasoning, math, science, and long-horizon software engineering. The headline numbers: 97.0% on AIME 2025, 94.5% on AIME 2026, parity with Claude Opus 4.6 on SWE-Bench Pro, and preference over Claude Sonnet 4.6 in blind side-by-side evaluations run by Surge. The model is in private preview through Microsoft Foundry, sits alongside MAI-Code-1-Flash and five other MAI-family models, and signals — at the platform level — that the dependency-on-OpenAI era of Microsoft's enterprise AI offering is materially over. The structural read isn't 'Microsoft shipped another model.' It's that the single largest enterprise software vendor in the world now owns the entire training stack of a frontier-tier reasoning model — data sourcing, post-training, eval, and inference — without a license to a frontier lab gating any layer of the dependency graph. Here's what that does to the multi-vendor routing strategy, the data-sovereignty conversation, and the procurement posture of any enterprise that was banking on Microsoft as its primary AI provider but quietly hedging because the underlying model was somebody else's.

Sonnet Code Editorial Team · June 6, 2026
AI Development · 10 min read

SAP Joule Studio 2.0 Rolls Out in June — LangGraph, AutoGen, and LlamaIndex Are Now First-Class Agent Frameworks Against Live SAP Data, the Agent Hub Becomes the Enterprise Governance Plane, and the "Agentic SDLC" Just Crossed Into Mainstream ERP. The Boundary Between Custom AI Engineering and Enterprise Business Software Just Got a Lot Blurrier.

SAP's Joule Studio 2.0, announced at Sapphire 2026 and rolling out through June, makes LangGraph, AutoGen, and LlamaIndex first-class agent frameworks against live SAP business data — with the new Autonomous Suite shipping 50+ domain Joule Assistants and 200+ specialized agents across finance, supply chain, procurement, HCM, and CX, and the SAP AI Agent Hub serving as the cross-enterprise governance, observability, and lifecycle-management plane for every agent regardless of who built it. Developers work inside VS Code and the MCP-enabled toolchain they already use, choose their preferred framework, and the platform layers SAP's business context, managed runtime, and enterprise governance underneath. The structural read isn't 'SAP shipped agent tooling.' Every major ERP vendor has shipped some version of that announcement in 2026. It's that the largest enterprise-software install base in the world just made the agentic-SDLC stack — the same LangGraph and AutoGen and LlamaIndex the AI engineering community has been building on for two years — a *standard component of the ERP layer*. The boundary between 'the custom AI work we hire a specialist firm to do' and 'the standard configuration work our SAP team does' just moved, and the team that walks into the next ERP-modernization budget cycle with the right framing of what falls on each side of the boundary wins the budget conversation. Here's what shifts when the agent frameworks are inside the ERP, and what to set up before the procurement team starts asking whether the AI specialist firm is even necessary.

Sonnet Code Editorial Team · June 5, 2026