Essays and field notes on AI, software engineering, design, and the craft of building product teams that ship. Written by the engineers doing the work.

Surge AI generates roughly $1.2B annually selling expert humans who write the data that trains frontier models — and just opened its first capital raise at a reported $25B valuation. Read against Anthropic's $30B run rate, the message is unambiguous: the human-in-the-loop layer is not commoditizing, and the premium isn't going away.

An open-source scaffold from OpenAutoCoder hit 79.2% on SWE-bench Verified paired with Claude Opus 4.5 — 1.7 points behind Anthropic's own internal harness, and the leading open-source result on SWE-bench Pro. The model layer is no longer where the alpha lives. The scaffold is.

Adobe says 31% of enterprises have a measurement framework for agentic AI. LangChain says 57% have agents in production and 32% cite quality as the top deployment barrier. The math doesn't add up — and the gap is the most expensive line item on the next eighteen months of AI roadmaps.

DeepSeek V4 Pro matches GPT-5.5 and Claude Opus 4.7 on most agentic benchmarks at 10–13x lower output cost, ships a 1M-token context, and lands under a standard MIT license. The open-weight tier just stopped being a fallback — and the routing playbook needs a fourth row.

Apple shipped agentic coding in Xcode 26.3: Claude Agent and OpenAI Codex run natively, and MCP is the open hook for everything else. iOS development just inherited the agentic stack the web has been quietly building for two years — and the strategic move is the MCP surface, not the partner logos.

On April 29, Mistral shipped Medium 3.5 — 128B dense, 256k context, 77.6% SWE-bench Verified, modified MIT license — alongside Vibe remote agents that run coding sessions in the cloud. For the first time, the open-weight option in agentic coding is close enough to the closed frontier to change the routing playbook. Here's what to do about it this month.