Losing Control Of Our Coding Agents

What Happens When Your Ambitions Are Bigger Than Your Tooling

Mar 05, 2026

The history of software engineering is the history of moving up the stack. We abstracted the hardware, then the memory management, then the infrastructure. And, for the past year or so, we’ve been abstracting the programming itself.

At the end of 2025, we described our AI-assisted development workflow as beekeeping. We’d spin up agents in parallel, feed them specs, and spend our time reviewing output instead of writing the code ourselves. The bees were making the honey. We were keeping the hives healthy.

This flow enabled our 5-person engineering team to consistently ship ~200 features every month. As we entered 2026, our ambitions grew. What could we change to get this same team to ship ~800 features every month?

Current tooling didn’t support our ambition, so we had to start building our own. This is what Steve Yegge eventually called “Stage 8”.

More Honey Than We Could Handle

Our team has become obsessed with shipping code.

One of our engineers caffeinates his laptop running in his backpack[1] during commutes, tethered to his phone, so he can Slack with the agents while they work[2].

Between the start and the end of 2025, my AI coding experience went from multi-line AI suggestions in VS Code to juggling 10-15 agents, each working on a different task.

Going into 2026, we’d gotten quite good at keeping bees, but we were starting to hit the limits of what we could manually manage. The bees were able to do more work; we just couldn’t keep up with them. I found myself constantly switching from one session to the next, reviewing progress and trying to keep agents unblocked.

We were outgrowing our current workflows. We wanted better tools that could help us track everything in one place, coordinate agents working towards the same goal, have multiple goals underway in parallel, and have humans efficiently review all of this work.

We needed an apiary.

Honey With A Hint Of Gas

The timing was almost comical. Just days after we’d been brainstorming solutions, Steve Yegge published “Welcome To Gastown”, introducing a tool that attempted to do much of what we’d been hoping for.

I’ve never had so many conflicting feelings about a piece of technology. Working with Gastown was simultaneously terrifying and awe-inspiring, draining and invigorating, maddening and entertaining[3]. Very much to Steve’s credit, he tried to warn me[4].

But the promise of Gastown was incredibly compelling. I could describe work for any number of tasks and off they’d go to be implemented somewhere. I could get status on everything in flight and jump directly to any agents that needed help, all from a single window. Now I could spend more of my time thinking about tasks and PRs instead of tracking terminals and worktrees.

The warnings not to use Gastown were fair, though. Bizarrely named branches started to appear in our GitHub. My commits were attributed to furiosa, nux, and chrome. PRs I hadn’t asked for would open, then be reopened or recreated after I closed them. New versions of Gastown appeared almost daily, and I recreated my workspace almost as frequently.

There were plenty of rough edges, but the shift in abstraction is what I kept coming back to.

I gave my team a tour. They were a bit wide-eyed, skeptical and bemused. But they got the idea and saw the promise. From our Slack:

[10:47 AM] ok yeah i flew too close to the sun and those gassy boys burnt me good
[10:47 AM] but it was illuminating

Gastown was someone else’s apiary with a very specific (and entertaining) vision for how agents should be organized and coordinated. It wasn’t quite right for us but it opened our eyes to what running a larger-scale operation could look like.

Self-Improvement

We took inventory of our bottlenecks (task management, agent management, review management, and so on) and did what we do best: had our agents build better tooling for us.

Beantown: Helps with dispatch
Pulls tickets from Linear, breaks them into agent-sized specs, and farms them out to available workers.
Coal Harbour: Helps with multiplexing
An app to tame the cross-product of (features × worktrees × terminals × agents).
Prism: Helps with code review
Uses parallel agents to do focused reviews on specialized areas (e.g. security, architecture, style) and speed up human review.
Lux: Helps with agent coordination
Simpler Gastown-inspired primitives for customizing and extending how groups of agents coordinate on shared goals.
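
To make the dispatch idea concrete, here is a minimal sketch of the "break a ticket into agent-sized specs and farm them out to available workers" loop. It is not Beantown's actual code: `Ticket`, `Spec`, `Worker`, `split_into_specs`, and `dispatch` are all hypothetical names, the ticket is assumed to already list its sub-tasks, and nothing here talks to Linear.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass
class Ticket:
    """A tracker ticket, reduced to the fields this sketch needs (hypothetical shape)."""
    key: str
    title: str
    tasks: list[str]  # assumption: the ticket already lists its sub-tasks


@dataclass
class Spec:
    """One agent-sized unit of work derived from a ticket."""
    ticket_key: str
    summary: str


@dataclass
class Worker:
    """An agent slot; `current` is None while the worker is idle."""
    name: str
    current: Spec | None = None


def split_into_specs(ticket: Ticket) -> list[Spec]:
    # Assumption: one spec per listed sub-task. A real splitter would
    # probably ask a model to do this and be far less mechanical.
    return [Spec(ticket.key, f"{ticket.title}: {task}") for task in ticket.tasks]


def dispatch(queue: deque[Spec], workers: list[Worker]) -> list[tuple[str, Spec]]:
    """Hand queued specs to idle workers, oldest spec first.

    Returns the (worker name, spec) pairs assigned in this pass. Specs
    that find no idle worker stay in the queue for the next pass.
    """
    assigned = []
    for worker in workers:
        if not queue:
            break
        if worker.current is None:
            worker.current = queue.popleft()
            assigned.append((worker.name, worker.current))
    return assigned


# Usage: three specs, two idle workers, so one spec waits its turn.
ticket = Ticket("ENG-1", "Export", ["add CSV endpoint", "add UI button", "write docs"])
queue = deque(split_into_specs(ticket))
workers = [Worker("bee-1"), Worker("bee-2")]
first_pass = dispatch(queue, workers)
```

The interesting design questions live outside this sketch: how small a spec should be, and what a worker reports back when it gets stuck.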

Some of these tools are pretty custom to how we work and only make sense within our walls, but some we’ll be releasing for others to use.

Each has helped us scale from beehives to apiaries.

Apiary Innovation

We’re far from the only ones exploring here. This space is still young and new takes are popping up daily. The changes in the process of writing code have outpaced the tools used to write it. No one’s entirely sure what the correct abstractions even are yet, but we’re starting to see the outlines forming more clearly.

One common thread: parallel specialized agents that can coordinate on a common goal are the focus heading into 2026.

Different agents have different strengths and capabilities. Having 2-3 agents design a feature or review the same PR will often lead to more comprehensive results than relying on just one: one catches a race condition, another notices an API contract violation, a third might focus on improved test coverage.
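
A rough sketch of that pattern: run several reviewers over the same diff concurrently, then merge their findings so that overlap reads as agreement instead of noise. Everything here is illustrative. `Finding`, `review_in_parallel`, and the two stub reviewers are hypothetical; in practice each reviewer would wrap a call to a different agent.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Finding:
    """One review comment; frozen so it can be used as a dict key."""
    file: str
    line: int
    issue: str


# A reviewer is any callable that takes a diff and returns findings.
Reviewer = Callable[[str], list[Finding]]


def review_in_parallel(diff: str, reviewers: dict[str, Reviewer]) -> dict[Finding, list[str]]:
    """Run every reviewer on the same diff and merge the results.

    Returns each distinct finding mapped to the names of the reviewers
    that reported it, with names in alphabetical order.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        futures = {name: pool.submit(fn, diff) for name, fn in reviewers.items()}
        results = {name: future.result() for name, future in futures.items()}

    merged: dict[Finding, list[str]] = {}
    for name in sorted(results):
        for finding in results[name]:
            names = merged.setdefault(finding, [])
            if name not in names:
                names.append(name)
    return merged


# Stub reviewers standing in for real agents (assumption: fixed output).
def concurrency_reviewer(diff: str) -> list[Finding]:
    return [Finding("cache.py", 42, "race condition on shared dict")]


def contract_reviewer(diff: str) -> list[Finding]:
    return [
        Finding("api.py", 10, "response drops a required field"),
        Finding("cache.py", 42, "race condition on shared dict"),
    ]


merged = review_in_parallel(
    "diff text goes here",
    {"concurrency": concurrency_reviewer, "contract": contract_reviewer},
)
```

Exact-match merging is the simplest possible choice. Real agents rarely phrase the same issue identically, so a practical version would need fuzzier grouping, for example by file and line range.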

Tools that make it easy to mix agents from different model families also seem to be the right call in 2026. Models are constantly leapfrogging each other, so it’s beneficial to not be beholden to one family of models if you want to stay on the frontier. Different agents have different personalities and strengths, which are hard to discern until you’ve seen how they operate in your codebase.

We like Claude for spec-writing, Codex for reviews, Gemini for task research.

Commercial beekeepers move their hives from Florida to California to chase the almond bloom, and to Washington for apples. You should always be moving your hives to maximize your honey.

From Beekeeper to Apiary Manager

Two days after I’d first learned of Gastown, my boss and I had our first walking 1:1 of the new year. I was three weeks from my one-year anniversary at Logic, and he asked how the last year had been. My answer: it was the most exhilarating, growth-inducing year of my career, with the possible exception of my first year out of school. The way I build software now is profoundly different from how I built it just 12 months ago.

In 2025, engineering became beekeeping. In 2026, the frontier isn’t the bees. It’s the infrastructure around them. Nobody has the apiary figured out yet, but a lot of people are building, experimenting, and sharing what they’ve learned.

If you’re building something similar, we’d love to hear about it either in the comments or at ben@logic.app. We’re all figuring this out together and would love to compare notes.

[1] Don’t tell AppleCare.

[2] We have agents that run in the cloud too, but we all know local is better.

[3] A sentence that actually came out of my mouth: "One sec, I just need to sling a couple of convoys so that my polecats can stay busy while we're getting lunch."

[4] The phrase "Do not use Gas Town" appears four times in Steve's post.

Bits of Logic
