Sonnet Code
Developer Tools · May 2, 2026 · 9 min read

Thirty Days Inside Cursor 3: The IDE Is an Org Chart Now

The IDE got rebuilt around an org chart

Thirty days into Cursor 3, the shift is clearer than it looked at launch. On April 2, Cursor replaced the Composer pane with a full-screen Agents Window — a tiled workspace where developers run multiple AI agents in parallel across local machines, git worktrees, remote SSH boxes, and cloud sandboxes. Agents kicked off from mobile, the web app, Slack, GitHub, or Linear all surface in a single sidebar, regardless of where they're actually executing.

What looks at first like a UI redesign is actually a re-framing of what an IDE is. For two decades the IDE was a single-user surface for editing files. Cursor 3 makes it a control surface for running and supervising a small team of agents — some of which the developer kicked off from a phone in a coffee shop and forgot about. The text editor is still there. It's no longer the center of gravity.

What the Agents Window actually changed

Three concrete shifts, each of which has implications past the demo video:

Parallelism stopped being optional. The previous Composer flow assumed one agent at a time, in front of you, in the file you were editing. The Agents Window assumes the opposite: you have four agents running, on four different repos or four worktrees of the same repo, and your job is to triage their outputs. The keyboard shortcuts, the diff navigation, the empty-state branch picker — all of it is built around the assumption that the human is reviewing more than they're typing.

Where the agent runs is now decoupled from where the developer sits. A local CLI session can be teleported to the cloud mid-task; an SSH session can be promoted to a long-running cloud agent; a worktree-bound agent can ship its diff to the developer's laptop for review. The question the developer has to keep in mind is "where is this work executing right now?", not "where do I open the file?"

The unified sidebar made async work visible. Pre-Cursor-3, an agent kicked off from Slack and an agent kicked off from the IDE were two separate workflows on two separate surfaces. Now they're rows in the same list, with the same status indicators and the same review affordances. That doesn't sound like much until you realize how much agent work was previously happening in places no one was watching.

What changes about engineering practice

We've now reviewed enough Cursor 3 codebases to see a few patterns harden into best practice and a few habits start producing technical debt at scale.

The patterns that worked:

  • Short-lived, narrowly scoped agents. Teams that get the most leverage out of the Agents Window run many short, well-specified agent tasks rather than a few long, ambitious ones. "Migrate the test fixtures in this directory to the new factory" beats "modernize the test suite." Smaller scope means cleaner diffs, faster review cycles, and fewer cases where the developer has to abandon work mid-task.
  • One worktree per agent. Conflicts between parallel agents on the same files are the dominant Cursor 3 failure mode. Teams that standardized on one git worktree add per agent task, with branch names that include the agent's session ID, almost never hit these conflicts. Teams that didn't ended up reviewing two agents' contradictory diffs against the same file and reverting both.
  • Reviewing before merging, not after. The Agents Window makes it tempting to let agents auto-commit. In practice the teams shipping production code review every diff inside the IDE before it leaves the developer's machine. The agent does the typing; the senior engineer still owns the merge.
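
The worktree-per-agent convention above can be sketched as a small helper. This is a minimal sketch, not Cursor's own mechanism: the agent/<session-id>/<task> branch scheme, the ../worktrees directory, and the function names are assumptions for illustration. Only the shape of the git command (git worktree add -b <new-branch> <path> <start-point>) comes from git itself.

```python
import re
import subprocess
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase the text and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def worktree_command(session_id: str, task: str, base: str = "main",
                     root: Path = Path("../worktrees")) -> list[str]:
    """Build the `git worktree add` command for one agent task.

    The branch name embeds the agent session ID (a hypothetical naming
    scheme), so two parallel agents never share a branch or a directory.
    """
    branch = f"agent/{slugify(session_id)}/{slugify(task)}"
    path = root / slugify(session_id)
    # git worktree add -b <new-branch> <path> <start-point>
    return ["git", "worktree", "add", "-b", branch, path.as_posix(), base]


def create_worktree(session_id: str, task: str, **kwargs) -> None:
    """Run the command in the current repository; raises if git fails."""
    subprocess.run(worktree_command(session_id, task, **kwargs), check=True)
```

Keeping command construction separate from execution makes the naming rule easy to check without touching a repository. Where the session ID comes from is left open here, since that depends on how your team launches agents.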

The habits that produced debt:

  • "Set and forget" agents on long-running cloud sessions. Teams that spawned long-horizon agents from Slack and never came back to them ended up with diffs that took longer to review than the agent took to write. The agent's productivity looked high; the team's throughput dropped.
  • Skipping the spec. A vague prompt to a parallel agent fleet produces four mediocre solutions to four slightly different interpretations of the problem. Teams that wrote a one-paragraph spec before kicking off agents (even a bad spec) got measurably better outputs than teams that didn't.
  • Treating agent review as a junior task. The temptation, with cheap parallel agents, is to throw all the boring work at the agents, leave juniors to triage the results, and free the senior engineer for the interesting work. The pattern that actually works keeps the senior in the loop: agents handle the boring work and the senior engineer reviews their output, because that's where the real engineering judgment shows up. Teams that moved seniors to "interesting greenfield" while juniors triaged the agent fleet shipped more bugs to production. Reviewing AI-generated code is not a junior task.
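
One way to counter the set-and-forget and skip-the-spec habits is to flag sessions that need a human before they pile up. The following is a minimal sketch under stated assumptions: the AgentSession record, its fields, and the four-hour idle threshold are all hypothetical, and the data would have to come from whatever session listing your team already has rather than from any Cursor API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AgentSession:
    session_id: str
    origin: str              # e.g. "slack", "ide", "github" (illustrative values)
    last_reviewed: datetime  # last time a human looked at the session
    has_spec: bool           # whether a written spec preceded the spawn


def needs_attention(sessions, now, max_idle=timedelta(hours=4)):
    """Return sessions spawned without a spec or left unreviewed longer
    than max_idle, ordered so the longest-neglected session comes first."""
    flagged = [
        s for s in sessions
        if not s.has_spec or now - s.last_reviewed > max_idle
    ]
    return sorted(flagged, key=lambda s: s.last_reviewed)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    fleet = [
        AgentSession("a", "slack", now - timedelta(hours=9), True),
        AgentSession("b", "ide", now - timedelta(minutes=30), True),
    ]
    for s in needs_attention(fleet, now):
        print(f"{s.session_id} ({s.origin}) needs review")
```

The threshold is a team decision, not a recommendation; the point is only that unreviewed agent work becomes a visible queue instead of a surprise.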

What this means for hiring and team shape

The agents-as-IC framing makes the team-shape question sharper. Three things we keep seeing:

  1. The leverage point is review, not authorship. The constraint on shipping is no longer how fast a developer can type; it's how fast a senior engineer can correctly review four agent diffs in parallel. That makes review skill — taste, architectural memory, the ability to spot a subtle wrong abstraction at a glance — the scarce resource.
  2. Junior engineers don't get cheaper because agents got cheaper. They get more important, because someone has to learn to do the review work, and you can't skip from new-grad to staff engineer without doing the time on the codebase. Teams that counted on agent productivity and stopped backfilling juniors are going to discover, in three years, that they have no senior bench.
  3. Pair programming is back, just not with humans. The Cursor 3 workflow that produces the cleanest code looks like a developer and an agent in tight back-and-forth on a tiled pane — spec, run, review, refine, commit — for forty minutes, then a different developer-agent pair on the next task. Solo coding is still possible. It's no longer the default productive shape.

Where the floor is now for "AI-fluent" engineering teams

If a buyer is evaluating an engineering vendor today, "do your engineers use Cursor" is a question that's already shifted to "how do they run their agent fleet?" The vendors that can answer cleanly — review discipline, worktree hygiene, spec-before-spawn, senior-on-review — clear that question fast. The vendors that say "yes we use Cursor" without a workflow behind it are going to lose deals to vendors that can describe the workflow in two minutes.

Sonnet Code's take

Our engineers ship in Cursor 3. The discipline we run on it is the same discipline we'd run with any tooling: senior-only review, worktree-per-task, written specs, no auto-merges into mainline. The Agents Window made all of that more visible, not less necessary. If anything, the leverage of agent fleets makes senior judgment scarcer per shipped line of code, which is the opposite of what the marketing copy on most coding-agent products implies.

That's the same reason we keep our team sized to stay senior. Multi-agent IDEs make junior-heavy teams slower, not faster, because nobody is left at the level where review actually catches the failure modes the agents introduce. If you're standing up an agent-augmented engineering practice and want a partner that's already past the early-Cursor-3 mistakes, talk to us — the playbook is hard-won and we'd rather hand it over than watch another team relearn it.