Essays and field notes on AI, software engineering, design, and the craft of building product teams that ship. Written by the engineers who do the work. Posts are published in English.

The Agent Client Protocol (ACP) — JSON-RPC 2.0 over stdin/stdout, Apache 2.0 license, designed by Zed and JetBrains as the editor-agent interoperability standard that LSP was for language tooling — crossed the maturity threshold this quarter. JetBrains brought ACP support to the entire IntelliJ family. Zed and JetBrains co-launched the ACP Registry, a directory of compatible agents that auto-updates across every ACP-compatible client. The Devin Desktop relaunch on June 2 shipped with ACP as the default surface, supporting Codex CLI, Claude Agent, OpenCode, Gemini CLI, GitHub Copilot CLI, and any in-house agent the customer's team builds. More than 25 agents are now in the registry; the protocol is shipping in production at platforms that together cover most of the IDE install base. The structural read isn't 'a new protocol shipped.' It's that the editor-agent interoperability boundary that every coding-agent vendor was treating as a defensive moat just got standardized — which means the customer who designs their AI coding stack around ACP gets editor-level portability for free, the closed-IDE vendors lose the lock-in they were quietly counting on, and the new procurement conversation is about agent runtime capabilities and model economics rather than which editor's plugin the engineering org commits to. Here's what that does to the IDE choice, the in-house agent build-vs-buy decision, and the multi-vendor routing strategy at the editor layer specifically.
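
To make "JSON-RPC 2.0 over stdin/stdout" concrete, here is a minimal sketch of the message envelope an editor might write to an agent's stdin, and how it might match a reply read back from stdout. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) come from the JSON-RPC 2.0 specification. The newline-delimited framing, the method name `example/ping`, and its parameters are assumptions for illustration and are not taken from the ACP specification.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids


def encode_request(method, params):
    """Return (id, line): one JSON-RPC 2.0 request as a newline-terminated line."""
    req_id = next(_ids)
    message = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return req_id, json.dumps(message) + "\n"


def decode_response(line, expected_id):
    """Parse one response line; return its result or raise on a JSON-RPC error."""
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if message.get("id") != expected_id:
        raise ValueError("response id does not match request id")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"agent error {err['code']}: {err['message']}")
    return message["result"]


# Hypothetical method name and params, for illustration only.
req_id, wire = encode_request("example/ping", {"text": "hello"})

# A real client would write `wire` to the agent subprocess's stdin and read
# the reply from its stdout; here the reply is simulated in-process.
simulated_reply = json.dumps({"jsonrpc": "2.0", "id": req_id, "result": {"text": "hello"}})
result = decode_response(simulated_reply, req_id)
```

Because the envelope is the same for every method, an editor that speaks it can in principle swap one agent for another without changing its transport code, which is the portability argument the paragraph makes.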

OpenCode launched in mid-2025 on a deliberately contrarian thesis — terminal-native, model-agnostic against 75+ providers, customer code never leaves the customer's environment, MIT licensed end to end — and twelve months later it is sitting at ~161,000 GitHub stars, ~7.5M monthly active developers, and a place in the top row of every honest 2026 AI coding survey alongside Claude Code, Cursor, and Codex. The desktop app and the VS Code / Cursor extensions shipped alongside the CLI; ACP support landed natively; the entire stack runs Apache-and-MIT under the hood with zero retention on customer code by design. The structural read isn't 'an open-source tool got popular.' It's that the open-source agent tier just established itself as a credible production option for regulated buyers — and that, with portability newly legible across MCP, ACP, and the model-agnostic provider matrix, the closed-source platforms are now competing on integration depth and per-token economics rather than on whether they are the only viable answer at all. Here's what that does to the multi-vendor routing portfolio, the air-gapped deployment story for regulated buyers, and the build-vs-buy boundary the next two procurement cycles will redraw.

The data labeling and human-feedback industry is in the middle of a structural rerating. Surge AI — founded in 2020, profitable from launch, zero VC money for five years — hit $1.2B in annualized revenue with the frontier-lab cohort (OpenAI, Google, Anthropic, Microsoft, Meta) as its dominant customer base, and started its first capital raise at a $15B-$25B valuation in mid-2025. Scale AI carries a $14B valuation on top of an integrated platform from RLHF workflows through secure-enclave deployment. The RLHF platform market itself is forecast to grow from $2.8B in 2025 to $18.6B by 2034, while the AI data-labeling market grows from $2.3B in 2026 to $6.5B by 2031 at 22.95% CAGR. The iMerit State of AI in the Enterprise study lands the operating number that matters: 96% of companies say human-in-the-loop is essential or nice-to-have for AI/ML projects, and 86% say it is strictly essential. The standard 2026 training pipeline at every major lab is Pre-training to SFT to Preference Optimization (RLHF/DPO) to RLVR (for reasoning), and DPO has emerged as the de-facto default for alignment fine-tuning — but the demand for skilled reviewers has moved up, not down, because the bottleneck on every step of that pipeline is now domain-deep human judgment rather than annotation volume. The structural read isn't 'data labeling is back.' It's that the discipline of designing the human-judgment surface — who reviews, what rubrics, which gold sets, what calibration cadence — just became the durable engineering moat of AI quality, separable from any single frontier lab's model lead. Here's what that does to the AI-training service shape, the buyer's eval-and-governance discipline, and the make-vs-buy decision for every enterprise about to spend FY27 budget on alignment work.
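
The market forecasts quoted above can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below applies it to the two forecasts in the paragraph. The function name is illustrative, and the year spans (2025 to 2034, 2026 to 2031) are read directly from the text.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.23 means 23% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures in billions of USD, as quoted in the paragraph above.
rlhf_cagr = cagr(2.8, 18.6, 2034 - 2025)     # RLHF platform market
labeling_cagr = cagr(2.3, 6.5, 2031 - 2026)  # AI data-labeling market

print(f"RLHF platforms: {rlhf_cagr:.1%} per year")
print(f"Data labeling:  {labeling_cagr:.1%} per year")
```

Both forecasts imply growth of roughly 23% a year, and the data-labeling result lands close to the 22.95% quoted, which is what one would expect if the endpoint figures were rounded.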

Anthropic released Claude Opus 4.8 on May 28, 2026 across claude.ai, the Claude API, Claude Code, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. Headline performance: agentic coding 64.3% to 69.2%, multidisciplinary reasoning with tools 54.7% to 57.9%, agentic computer use 82.8% to 83.4%, and the only frontier model to complete every case end-to-end on the Super-Agent benchmark at parity-on-cost with GPT-5.5. The behavioral framing is 'sharper judgement, more honesty about progress, and the ability to work independently for longer.' The structurally important launch is what shipped alongside the model: effort controls on claude.ai that let users dial how hard Claude works on a task, Fast mode (2.5x speed) at 3x cheaper than the prior generation's Fast tier, and Dynamic Workflows in Claude Code as a research preview — a single orchestrator session that spawns hundreds of parallel subagents, each with its own context window, and aggregates results into a single coherent output. That is the orchestrator-workers pattern, the highest-leverage pattern in the agentic-design literature, shipped as a first-class Claude Code primitive rather than a multi-quarter platform-engineering project the customer pays a specialist firm to build. The structural read isn't 'Claude got better.' It's that the agentic-pattern-as-platform-primitive era just officially started for the highest-spend cohort of AI coding customers. Here's what that does to the embedded-agent build-vs-buy decision, the routing portfolio, and the eval-and-governance discipline a regulated buyer needs to wire on top before the orchestrator's promise becomes production reality.
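
The orchestrator-workers pattern described above can be sketched in a few lines. This is a generic illustration of the pattern under stated assumptions, not how Dynamic Workflows is implemented: `run_subagent` is a hypothetical stand-in for a real model call, the private `context` list stands in for a per-subagent context window, and a thread pool stands in for whatever scheduling the real product uses.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(subtask):
    """Hypothetical stand-in for one subagent with its own private context."""
    context = [f"task: {subtask}"]              # not shared with other workers
    context.append(f"done: {subtask.upper()}")  # placeholder for real work
    return {"subtask": subtask, "summary": context[-1]}


def orchestrate(goal, subtasks, max_workers=8):
    """Fan subtasks out to parallel workers, then merge results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_subagent, subtasks))  # map preserves order
    merged = "\n".join(r["summary"] for r in results)
    return f"{goal}\n{merged}"


report = orchestrate("Migrate logging calls", ["module_a", "module_b", "module_c"])
print(report)
```

The design point is that each worker sees only its own subtask, so the orchestrator's context holds the short summaries rather than every worker's full transcript. The eval-and-governance work the paragraph mentions would sit around `run_subagent` and the merge step.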

Cursor's run from $4M to $2B in annualized revenue in 18 months is now public, enterprise accounts for roughly 60% of that revenue, ~70% of the Fortune 1,000 is represented in the customer base, and the company hired a former Rubrik President/CRO in February to build the enterprise sales motion. Composer 2.5 shipped in May 2026 with targeted reinforcement learning against long-horizon coding tasks and complex instruction-following — scoring 62 on the Artificial Analysis Coding Agent Index, third behind Claude Opus 4.7 in Claude Code (66) and GPT-5.5 in Codex (65). The Teams pricing surface restructured into Standard seats ($32/seat annual, $40 monthly) and a new Premium tier ($96/seat annual) at 5x Standard usage. The structural read isn't 'Cursor is winning.' Everybody who watches the space already knew that. It's that the AI coding category is now a market in late-cycle consolidation around a small handful of platform vendors, and the buyer's procurement conversation just stopped being about which tool the engineering org uses day-to-day and started being about which vendor's platform roadmap your engineering culture is going to inherit through the next budget cycle. Here's what that does to the multi-vendor routing strategy, the make-vs-configure boundary on embedded agentic capability, and the eval discipline a regulated buyer needs before signing the multi-year platform contract that comes next.
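
The Teams pricing figures above imply some simple ratios that are worth working out before a procurement conversation. The sketch below uses only the numbers quoted in the paragraph. It assumes the three prices are quoted in the same per-seat billing unit and that a Premium seat's 5x usage allowance would actually be consumed, neither of which the text confirms.

```python
STANDARD_ANNUAL = 32.0        # USD per seat, annual billing (from the text)
STANDARD_MONTHLY = 40.0       # USD per seat, monthly billing (from the text)
PREMIUM_ANNUAL = 96.0         # USD per seat, annual billing (from the text)
PREMIUM_USAGE_MULTIPLE = 5.0  # Premium usage relative to Standard (from the text)

# Discount for choosing annual over monthly billing on a Standard seat.
annual_discount = 1 - STANDARD_ANNUAL / STANDARD_MONTHLY

# How many times more a Premium seat costs than a Standard seat.
price_multiple = PREMIUM_ANNUAL / STANDARD_ANNUAL

# Cost of one "Standard seat's worth" of usage when bought through Premium.
premium_cost_per_unit = PREMIUM_ANNUAL / PREMIUM_USAGE_MULTIPLE
unit_saving = 1 - premium_cost_per_unit / STANDARD_ANNUAL

print(f"Annual billing discount: {annual_discount:.0%}")
print(f"Premium price multiple:  {price_multiple:.0f}x for {PREMIUM_USAGE_MULTIPLE:.0f}x usage")
print(f"Per-unit saving on Premium: {unit_saving:.0%}")
```

On these assumptions, Premium costs three times as much for five times the usage, so it only pays off for seats that would otherwise exceed roughly three Standard allowances.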

Gartner published the 2026 Magic Quadrant for Enterprise AI Coding Agents on May 20 and circulated the supplementary report 'Leading in Enterprise AI Coding Agents Requires More than Product Momentum' in the weeks since, with the consolidated read landing this week as engineering leaders synthesize it for FY27 planning. The category is now estimated at $9.8B–$11.0B annualized as of April 2026 and growing into a second phase of competitive realignment: frontier model providers (Anthropic, OpenAI, Google) are moving up the stack into the application layer with native agentic coding products, application-layer vendors (GitHub, Snowflake, ServiceNow, OutSystems) are integrating frontier models more deeply and shipping native agent runtimes, and a meaningful second and third tier (Cursor, Replit, Windsurf, Codeium, Cognition, Grok, and the open-source agent-runtime cohort) is contributing real revenue, particularly in enterprise deployments. The headline productivity number that matters: 90% of engineering leaders report improvements with a net average gain of 19.3%, but the distribution is sharply asymmetric: the teams with mature eval discipline, calibrated senior-review queues, and FinOps attribution at the agent-action granularity are capturing meaningfully more than the average, and the teams without those primitives are capturing meaningfully less, sometimes nothing, sometimes negative. The structural read isn't 'AI coding agents work and you should adopt them.' Everyone already knows that. It's that the gap between teams that capture the gains and teams that don't is now the dominant procurement and platform-engineering question, and Gartner's own framing (by 2027, over 65% of engineering teams using agentic coding will treat IDEs as optional, shifting control, governance, and validation to automated platforms) implies that the platform-engineering investments that turn the average 19.3% lift into the top-quartile lift have to ship inside the next three quarters or the competitive position is set for the cycle.