My Coding Agent Bill Just Hit $250 A Day. Here’s What I Plan To Do About It.

As the CTO of our small custom software company, I’m the one who does the most R&D. And so I’ve been working out how best to integrate AI coding agents into our development cycle.

By now, I’ve set up a comprehensive set of global rules that take us — me and My AI — all the way from creating tickets, working on them, creating a fix or feature branch, committing and pushing the work, adding comments to the ticket, setting the ticket to the appropriate status, creating instructions for QA, to finally creating a worklog entry for the ticket so I can report what I did at our morning standup.

Learning how to work with My AI involves doing real work on real projects. I’m running multiple agents simultaneously, and that, as I found to my dismay, adds up pretty quickly.

It was just recently that I needed to up my monthly spend to $400 monthly. Then I needed to up it again, to a threshold of $1000.

I reached that threshold in 5 days.

— — — — — — — -

The coding agent I use has brought me a lot, but costs are rising astronomically.

— — — — — — — -

Coding agents are defined by the interface they provide to their underlying coding models. Many developers use an IDE, so Cursor is the agent of choice for users of PyCharm or VS Code.

I’m old and old-fashioned. I don’t use an IDE, I use a simple text editor. And the OSX Terminal.

From there, it was a quick hop to Warp. Actually, I’d been using Warp as an extended Terminal for quite a long time before I started using it’s Agentic AI features.

The people at Warp are really active. Every couple of weeks updates are released, by now there’s a dizzying array of features and settings. Most of which I’ve never used or was even aware of.

It was only after I’d gotten spooked by high inference costs that I took a closer look.

And learned two crucial things: a) Warp surcharges inference, by a lot, and b) you can reduce costs by running models directly, but only those of Anthropic and OpenAI.

— — — — — — — -

Using Mistral models can reduce costs by a staggering forty times.

— — — — — — — -

I get it, they have to make money some way. But the differences in costs are verging on the ridiculous.

For example, one days work on a single ticket cost almost 9000 credits, which is, in the convulsed system of surcharges and discounts that is Warp, about $80. How much cheaper would Mistral Vibe be?

That 9000 credits gives me about 1.6 million tokens, with different cost per million tokens for input and output, assuming an equal input/output rate. For interests sake, I included the costs of the option Warp provides to insert your own Claude or OpenAI GPT key.

So here goes:

Warp (Claude Sonnet), with the surcharge, but also the reload discount:

input/m: $4.50, output/m $22.50, 1.6m tokens = $80

Direct Claude API (with BYOK)

input/m: $3.00, output/m $15.00, 1.6m tokens = $30

Mistral Vibe (Devstral 2)

input/m: $0.40, output/m $2.00, 1.6m tokens = $2

Yikes!

— — — — — — — -

Vendors providing interface for Agentic AI are a trap. Use your coding agent to roll your own.

— — — — — — — -

As I said I was using Warp as an enhanced terminal emulator long before I started using its AI features. It turns out however that to use Mistral’s Vibe CLI I’ve got to let Warp go.

Which is a great pity because of the effort put into integrating Warp with our software development lifecycle.

It would be great to be able to insert your Devstral 2 key into Warp but that option was removed. So much for the aspirations of the new generations of cloud software services.

We’ll just need to start again. But not from scratch.

Yesterday I used Claude Sonnet rewrite Warp’s global rules into a set of shell scripts that are to be used with Mistral Vibe CLI.

— — — — — — — -

We’re rebuilding our complete software development lifecycle in our own software.

— — — — — — — -

Here’s a thought: if I can use more expensive coding agents to build far more customised and far cheaper coding solutions, can’t we do the same for the SaaS solutions we’re using?

Just like any company, the costs of third party platforms are eating into our budget, while we use but a fraction of their features.

Every before AI, we were building our own custom software, for internal operations but also for services to our clients.

Yes, there’s a cost to that too, the security and maintenance now falls on you. What if autonomous agents living on the servers could — safely — do the necessary updates?

Maybe it’s a dream. But as a small software company deeply invested in AI technology and automation, we’re uniquely positioned to try.