Essays and field notes on AI, software engineering, design, and the craft of building product teams that ship. Written by the engineers who do the work. Posts in English.

On May 8, 2026 Snyk announced it had embedded Anthropic's Claude models inside its AI Security Platform, exposed to Claude Code and competing agentic editors over the Model Context Protocol. Vulnerability discovery, prioritization, and auto-fix now run as MCP tool calls — the model writes code, the MCP tool scans it, the model patches it, in the same loop. Snyk also shipped Evo by Snyk, a governance layer that continuously discovers AI assets across the org (models, agents, MCP servers, datasets, third-party tools) and red-teams running agents for prompt injection and data exfiltration. The pitch is clean: AI generates code faster than human review can keep up, so make security a tool the model calls. The reality is more interesting. MCP-native security doesn't eliminate the review tax on AI-generated code — it restructures it. The surface moves from 'humans staring at every diff' to 'humans designing the policy that decides which findings auto-fix, which queue, and which escalate.' That design work isn't a Snyk product. It's the layer above the integration — and whether your team builds it well determines whether the new tool surface is a security win or a faster way to ship signed-off vulnerabilities.
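
The policy layer described above (which findings auto-fix, which queue, which escalate) can be sketched as a small routing function. This is a minimal illustration, not Snyk's schema or API: the `Finding` fields, the severity labels, and the thresholds are all hypothetical, chosen only to show where a team's own judgment gets encoded.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_FIX = "auto_fix"   # the model applies the proposed patch in the loop
    QUEUE = "queue"         # lands in a normal review backlog
    ESCALATE = "escalate"   # a human must look before anything merges


@dataclass(frozen=True)
class Finding:
    # All fields are hypothetical; a real scanner's output will differ.
    rule_id: str
    severity: str          # "low", "medium", "high", or "critical"
    fix_available: bool    # the scanner proposed a patch
    touches_auth: bool     # the finding sits in auth or secrets-handling code


def route(finding: Finding) -> Action:
    """Map one finding to an action. Thresholds here are illustrative."""
    # Critical findings and anything near auth code always go to a human.
    if finding.severity == "critical" or finding.touches_auth:
        return Action.ESCALATE
    # Low and medium findings with a proposed patch are safe to auto-fix.
    if finding.fix_available and finding.severity in ("low", "medium"):
        return Action.AUTO_FIX
    # Everything else (e.g. high severity, or no patch) waits in the queue.
    return Action.QUEUE


if __name__ == "__main__":
    sample = Finding("HYPOTHETICAL-001", "medium", True, False)
    print(route(sample).value)
```

The point of writing the policy as code is that it becomes reviewable and testable on its own, separate from whichever scanner produces the findings.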

On May 28, 2026 Anthropic closed a $65B Series H at a $965B post-money valuation, eclipsing OpenAI's $852B mark and making it the most valuable private AI company in the world. The number that matters more than the valuation is the one two paragraphs into the announcement: $47B annualized run-rate revenue, up from $14B at the Series G in February. That is roughly 3.4x in about three months: not pilots, not hype, but procurement throughput on contracts that have already been decided. Anthropic also pre-announced Claude Mythos, a model 'a full capability tier above Opus 4.7,' for wide release 'in the coming weeks,' and named HIPAA and FedRAMP High as 2026 contracting targets. The structural read isn't 'AI is hot.' It's that enterprise gravity has visibly tilted, the Mythos drop is on a calendar, and the stack decision your team has been deferring just got easier and more urgent in the same week. Here's what changed about the buying environment, what's now visible on Anthropic's roadmap, and what it means for any product team still sitting on the Claude vs. GPT vs. Gemini fence.

Surge AI — founded in 2020, bootstrapped, no venture capital — crossed $1.2 billion in annualized revenue in 2024 and was valued north of $25 billion by mid-2025, fed almost entirely by ~12 frontier AI labs each spending roughly $1 billion a year on human-generated training data. Scale AI, the better-known competitor, sat at $870 million over the same window before its partial Meta acquisition triggered a client exodus. The data labeling market is forecast to grow from $4.87B in 2025 to $29B by 2032 at 29% CAGR. The pattern underneath those numbers is the part that matters: frontier labs discovered that training models on their own outputs creates feedback loops, and the only break in the loop is fresh, expert-verified human evaluation. The era of cheap volume annotation is over. The new scarce input is domain expertise — a securities attorney red-teaming a legal agent, a radiologist ranking a vision model, a senior engineer evaluating production code. Here's how the economics actually changed, and what it means for any company trying to train a model on its own domain.
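
The market forecast quoted above is easy to sanity-check with the standard compound-growth formula. The sketch below assumes seven compounding years (2025 to 2032) and uses only the figures in the paragraph; the function names are illustrative.

```python
def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `start` at `rate` per year for `years` years."""
    return start * (1 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    # $4.87B in 2025 growing at 29% per year through 2032 (7 years).
    print(f"projected 2032 market: ${project(4.87, 0.29, 7):.2f}B")
    # Growth rate implied by going from $4.87B to $29B over the same span.
    print(f"implied CAGR: {implied_cagr(4.87, 29.0, 7):.1%}")
```

Under that seven-year assumption the two endpoints and the 29% rate are mutually consistent, which is all this check establishes; it says nothing about whether the forecast itself is sound.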

By February 2026 MCP had crossed 97 million monthly SDK downloads, A2A had grown to 150+ participating organizations, and both protocols plus ACP now sit under Linux Foundation governance. Over 100 enterprises have formally adopted both. Anthropic, OpenAI, Google, Microsoft, and Amazon all ship them. The two-layer stack — MCP for vertical tool integration, A2A for horizontal agent coordination — has stopped being a 'standards debate' and become the architectural default that procurement, security, and engineering teams expect to see in any agent system they buy or build. For any company still rolling its own integration layer in 2026, the calculus just inverted: the custom adapter that was a competitive moat in 2024 is now a maintenance liability that locks you out of every protocol-native tool, agent, and audit surface vendors are about to ship. Here's what the convergence means in practice, and what to build above the protocol stack to actually capture the value.

On May 28, 2026 Anthropic shipped Claude Opus 4.8 alongside Dynamic Workflows in Claude Code — a feature that lets the model write its own orchestration scripts and spin up tens to hundreds of parallel subagents in a single run, capped at 16 concurrent and 1,000 total. Same price as Opus 4.7. Fast mode is now 3× cheaper. The model is 4× less likely to leave a code flaw unflagged, scores 0% on uncritically reporting flawed results, and the new effort-control UI lets you dial reasoning budget like a knob. The headline numbers are the easy story. The structural story is that orchestration — the layer every serious AI engineering team has been hand-building for a year — just became a product primitive shipped by the model vendor. That changes what's worth building yourself, what's worth buying, and where the real differentiation lives. Here's what 4.8 actually changes for teams shipping AI features into products.
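
For teams that have been hand-building this layer, the two caps mentioned above (16 concurrent, 1,000 total) correspond to a familiar pattern: a semaphore for concurrency plus an up-front check on total work. The sketch below is not how Claude Code implements Dynamic Workflows; `run_subagent` is a hypothetical stand-in for real agent work, and the peak counter exists only to make the cap observable.

```python
import asyncio

MAX_CONCURRENT = 16   # cap on subagents running at the same time
MAX_TOTAL = 1000      # cap on subagents in a single run


async def run_subagent(task_id: int) -> int:
    """Hypothetical stand-in for one subagent; yields control, then returns."""
    await asyncio.sleep(0)
    return task_id


async def run_all(task_ids, max_concurrent=MAX_CONCURRENT, max_total=MAX_TOTAL):
    """Run every task under both caps; returns (results, peak concurrency)."""
    task_ids = list(task_ids)
    if len(task_ids) > max_total:
        raise ValueError(f"{len(task_ids)} tasks exceeds cap of {max_total}")

    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def guarded(task_id: int) -> int:
        nonlocal in_flight, peak
        async with sem:  # blocks while max_concurrent tasks are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await run_subagent(task_id)
            finally:
                in_flight -= 1

    # gather preserves input order in its result list.
    results = await asyncio.gather(*(guarded(t) for t in task_ids))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_all(range(100)))
    print(len(results), "tasks finished; peak concurrency", peak)
```

Roughly this much code is what a vendor-shipped orchestration primitive replaces; retries, budgets, and result validation are the parts a team would still own.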

By 2026 the global human-in-the-loop AI market is on track to cross $17 billion, with each major frontier lab spending roughly $1 billion a year on human-generated training data — and the rate card for that work has six rungs: data annotators at $15–25/hr, AI tutors at $20–55/hr, RLHF specialists at $50–65/hr, prompt engineers at $40–65/hr, red teamers at $100–200/hr, and domain expert evaluators from $130 all the way up to $1,000/hr. The radiologist ranking model outputs on chest CTs is billing more than your senior engineers. So is the securities attorney red-teaming a financial agent. The cheap-annotation era is over — at both ends of the funnel — and the work that produces frontier-quality enterprise models is expert judgment, not labeled volume. For any company trying to train a model on its own domain, the operating model that wins in 2026 looks more like staffing a clinical-trials team than running a labeling vendor. Here's what changed, and what to build before you scale your training spend.