Working with an LLM agent is... good?

Over the past month, I’ve basically done a 180-degree turn in my opinion of LLM agents. I’m going to try to capture some of the things I’ve learned, since the four or so people I’ve shared them with said they found them helpful.

This blog is written entirely by hand. It’s an artisanal blog <smug>.

Why I tried it

Oh, this one is easy. My boss told me to.

He made a good point. All of our customers are using agents to interact with our work. If we don’t make sure our product works well with agents, we will fall behind our competitors and customers will not buy the product. There is no better way to understand what a customer needs when they use an agent than by learning to use an agent effectively.

A little over a month ago, based on this guidance, I started approaching more and more of my tasks agent-first.

What wasn’t working

I was trying to do most of the heavy lifting myself and just tell the agent to go do menial tasks. This was terrible. I work in Rust, which has some pretty bad compile times. The agent would happily work away at some kind of refactor I directed it to do, but it would frequently take 10-20 minutes for something that really shouldn’t take very long. In theory, that was 10-20 minutes where I could work on other things and not have to monitor it.

In practice, this wasn’t working because every 10-20 minutes I would have to context switch as a small part of the work got finished: I’d check on it, then send the agent off to do something else.

That context switching every time an agent finished was killing my productivity. I was constantly going back and forth all day.

How I learned to work with the agent

In two words: plan mode.

I’m using Claude, and specifically Claude Code, for most of my work. I stopped trying to do all the heavy thinking myself and instead started to have a discussion with the agent. I switched into plan mode and started to give long descriptions of the problem I was trying to solve, what I thought was probably the right way to go about it, and what I thought some of the pitfalls were going to be.

This one small change was a mental flip for me. Instead of directing every piece of work to get to the end state, I’m now working with the agent on a long-running plan that I intend to leave it to carry out on its own. Part of the plan is intermediate checks that tell us we’re doing the right thing.

When the agent came back with the plan, I tried to treat it as if my arch-nemesis had come up with it. I looked for things to find wrong with it. I wanted to interrogate the agent with as much criticism as I possibly could.

Eventually, these conversations became… fun‽

I realized I was having a good time. I was spending time working on the parts of software design that are the most fun for me. I’m building solutions and plans on how to get there.

After enough back and forth with the agent, eventually we had a plan that I looked at and said, “that’s as good as what I would do.”

Then I let it chomp away, sometimes for hours, on the plan. I only started my review after it had a full working solution.

Some tips I found along the way

Test-driven development really works with agents. I started using the agent to design the test criteria up front, instead of just telling it to add unit tests after the work was complete.
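Here’s a minimal sketch of what test-first could look like in Rust. Everything in it is hypothetical: the function parse_duration_secs and its rules ("90s" means 90 seconds, "2m" means 120) are made up for illustration and don’t come from a real project. The idea is that the tests at the bottom get agreed on during planning, and the agent’s job is then to make them pass.

```rust
// Hypothetical example: the function name and its behavior are assumptions
// for illustration only.

/// Parses strings like "90s" or "2m" into a number of seconds.
/// Returns None for anything it does not understand.
pub fn parse_duration_secs(input: &str) -> Option<u64> {
    let input = input.trim();
    if let Some(number) = input.strip_suffix('s') {
        // Seconds: the remaining text must be a plain unsigned integer.
        number.parse().ok()
    } else if let Some(number) = input.strip_suffix('m') {
        // Minutes: convert to seconds, returning None on overflow.
        number.parse::<u64>().ok()?.checked_mul(60)
    } else {
        None
    }
}

// These tests are the "criteria" part: written (or at least agreed on)
// before the implementation above exists.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_seconds() {
        assert_eq!(parse_duration_secs("90s"), Some(90));
    }

    #[test]
    fn parses_minutes() {
        assert_eq!(parse_duration_secs("2m"), Some(120));
    }

    #[test]
    fn rejects_garbage() {
        assert_eq!(parse_duration_secs("abc"), None);
        assert_eq!(parse_duration_secs(""), None);
    }
}
```

With the tests in place first, "done" has a definition the agent can check by running cargo test, rather than one I have to judge by eye at the end.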

I can use nono with Claude and live dangerously without worry. Here’s my command:
nono run --allow . --profile claude-code-base -- claude --dangerously-skip-permissions

In that command, claude-code-base is a custom profile that grants access to a handful of folders I’m comfortable letting it mess up.

When I make a plan, I have the agent save the plan out to a markdown file. I use this file both for my own review and to have a second agent evaluate the first agent’s work. I open a clean session so that it has no prior knowledge of what we did in the previous session. Then I feed it the markdown and ask the agent to evaluate the merits of the plan. This works really well!

The agents get lost along the way. Especially in long-running sessions, it’s not uncommon for them to get caught up in side tasks. I have to keep the same questioning attitude I had while making the plan for the entire session.

Closing thoughts

Using Claude Opus has really changed the way I’m approaching coding tasks. I’m finding that I’m spending more time on the “is this the right thing to do” questions and less time on the “is this the right way to do the thing” questions. For me, it’s an important distinction. I think the agents are pretty good at the latter but can easily lose sight of whether we’re doing the right thing.

I do worry about anchoring. I worry that when the agents can give us such detailed answers with very little work, we sometimes fall into the trap of assuming their starting point is the right one. I think this happens in all of our discussions, but when these agents give you a starting point plus a detailed plan for how to get there, it’s very easy to just go along with it.

So I guess working with an agent is making me more cynical about the work and happy to do it. Is that a good combo? ¯\_(ツ)_/¯
