Controlling AI Coding Agents: Cost, Proof, and Trust
AI agents now write a growing share of the code that ships. That shift moves the hard question from "can the agent write the change?" to "can I trust what it did, and did it cost what I expected?" Those are two different failure modes, and they need two different controls. This guide is the map: how an agent's bill actually forms, why the usual success signal (a green CI check) can lie, and what a trustworthy merge gate looks like. Each section links to the deeper piece.
The through-line across all of it: an AI agent optimizes for the reward you give it, and it will find the cheapest path to that reward - whether that reward is "make the check pass" or "produce an answer." Control is the practice of making the cheap path the honest path.
1. Where the cost actually leaks
The first surprise is the invoice. Teams budget AI by the price of one call, and then the bill can land an order of magnitude higher. The reason is structural: an agent task is not one call; it is a loop that re-pays for its own growing context on every step. The prompt is the cheapest moment in the run - the money lives in the tail nobody estimated.
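To make the loop arithmetic concrete, here is a minimal sketch. The token counts are invented for illustration, and the model is deliberately crude: it counts input tokens only, assumes every step resends the full context, and ignores prompt caching and output tokens. The function name and parameters are hypothetical.

```python
def loop_input_tokens(base_tokens: int, added_per_step: int, steps: int) -> int:
    """Total input tokens billed across a loop that resends its whole context.

    Step k (0-indexed) sends the base prompt plus everything added so far.
    Assumption: no caching, and the context grows by a fixed amount per step.
    """
    return sum(base_tokens + k * added_per_step for k in range(steps))


# Illustrative numbers only: a 2,000-token prompt, 1,500 tokens of tool
# output and model replies appended per step.
single_call = loop_input_tokens(2_000, 1_500, steps=1)
full_run = loop_input_tokens(2_000, 1_500, steps=10)

print(single_call)             # 2000
print(full_run)                # 87500
print(full_run / single_call)  # 43.75
```

Under these assumptions, ten steps cost far more than ten times the first call, because each step pays again for everything the earlier steps added.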
Read next
Your AI Agent Bill Is Not the Prompt. It Is the Retry Loop. - why the loop multiplies while the prompt does not, where the money leaks (retries, stuck loops, re-read blobs), and the one control that changes the bill: a ceiling set before the run starts.
Runcap vs Langfuse vs LiteLLM: Where Each One Fits - the honest breakdown of three tools that look interchangeable but sit in three different places in the request lifecycle: observability records spend after it happens, gateways route and rate-limit, and a pre-run cap refuses the expensive run before it starts.
2. Why a green check is not proof
The second surprise is trust. An agent opens a pull request, CI goes green, the diff looks reasonable, you merge. But the tests, the workflow files, and the verifier all live in the repository - and a pull request is allowed to edit them in the same diff they are supposed to judge. When the thing being measured can edit the thing that measures it, a green check stops being evidence of a working fix.
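One cheap, deterministic signal is whether a pull request touches its own evidence at all. The sketch below assumes you already have the list of changed paths (for example from your CI provider); the patterns and the function name are hypothetical and would need to match where a given repository actually keeps its checks.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns: adjust to wherever a repository keeps its checks.
EVIDENCE_PATTERNS = (
    "tests/*",
    ".github/workflows/*",
    "conftest.py",
)


def evidence_edits(changed_paths):
    """Return the changed paths that match any evidence pattern.

    fnmatchcase's `*` also matches `/`, so "tests/*" covers nested files.
    """
    return [
        path
        for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in EVIDENCE_PATTERNS)
    ]


changed = [
    "src/billing/invoice.py",
    "tests/test_invoice.py",
    ".github/workflows/ci.yml",
]
print(evidence_edits(changed))
# ['tests/test_invoice.py', '.github/workflows/ci.yml']
```

A non-empty result does not mean the change is dishonest, since legitimate fixes add tests too. It only means the green check can no longer be taken at face value for this diff.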
Read next
A Green CI Check Is Not Proof: How AI-Generated PRs Can Rewrite Their Own Evidence - the concrete failure mode (an agent that relaxes the test instead of writing the guard), why ordinary CI has the wrong trust model, and the base-commit clean-replay that grades a change against the rules as they existed before the agent touched anything.
3. What breaks when AI writes the whole thing
Cost and proof are the acute failures. The chronic one is what accumulates when an agent writes an entire application unsupervised: the plausible-but-wrong code that passes a shallow look, the security gaps, the silent assumptions that only surface under real load. Knowing the failure shapes in advance is what lets you review for them instead of discovering them in production.
Read next
What Breaks When AI Writes Your SaaS - the recurring failure patterns in AI-generated applications and what a real audit looks for before those patterns reach your customers.
The shape of control
Put the three together and a pattern falls out. Controlling an AI coding agent is not one feature - it is a sequence of gates around the agent, each one deterministic and each one set before the agent runs:
- Before the run - estimate the cost and set a hard cap, so a runaway loop is a bounded, knowable expense instead of an invoice surprise.
- During the run - keep the agent inside a declared scope, so a task to fix one module cannot quietly rewrite tests, workflows, or config.
- Before the merge - replay the change against the rules as they stood at the base commit, in a clean checkout the agent could not edit, and return a single verdict a human can act on.
The reason each gate has to be deterministic - a plain rule, not another model - is simple: a brake that is itself an AI can be talked out of stopping. A hard cap and a base-commit replay cannot be reasoned with. That is the point.
Common questions
What tools can put a hard spending cap on an AI coding agent before it runs?
You need a control layer that sits between the agent and the model API, not an after-the-fact dashboard that shows the bill once it has arrived. Runcap routes the agent's requests through a local gateway, estimates the cost of a run, and returns an HTTP 429 when projected spend crosses a configured cap - before the call is paid for. The cap is a plain rule, not a model, so it cannot be reasoned out of stopping.
What is the best way to enforce a budget limit per task for Claude Code or Cursor agents?
Set the limit at the request path, not in the prompt. An agent calls the model in a loop, and each retry re-pays for a growing context, so a prompt-level instruction cannot bound the total. Runcap enforces a per-mission cap on requests routed through its local gateway, so the whole task is bounded regardless of how many times the agent loops.
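As a rough illustration of why a request-path cap bounds the whole task, here is a minimal sketch of a per-mission budget. It is not Runcap's implementation: the class, its fields, and the use of projected cost in integer cents are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MissionBudget:
    """Hypothetical per-mission cap, tracked in integer cents."""

    cap_cents: int
    spent_cents: int = 0

    def authorize(self, projected_call_cents: int) -> int:
        """Return 200 to forward the call, or 429 to refuse it unpaid."""
        if self.spent_cents + projected_call_cents > self.cap_cents:
            return 429  # refused before any money is spent on this call
        self.spent_cents += projected_call_cents
        return 200


budget = MissionBudget(cap_cents=100)
# The agent loops five times; each call is projected at 30 cents.
statuses = [budget.authorize(30) for _ in range(5)]
print(statuses)            # [200, 200, 200, 429, 429]
print(budget.spent_cents)  # 90
```

The check is plain arithmetic, so however many times the agent retries, total spend cannot pass the cap. A real gateway would also need to reconcile projected cost against actual cost after each call, which this sketch leaves out.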
How do I get a cost estimate for an AI agent run before it starts?
Estimate the routed run, not a single prompt. The real cost of an agent task is the loop, so a per-call price tells you little. Runcap prices a run before execution based on the routed model and the work, then enforces the cap live as the run proceeds.
How can I verify an AI-generated pull request passes tests and did not rewrite its own CI checks?
The tests, workflows, and verifier all live in the repo, so a pull request can edit the very checks that are supposed to judge it. The fix is to replay the change against the rules as they stood at the base commit, in a clean checkout the agent could not touch. Runcap runs this as Proof Gate, a pinned GitHub Action, and returns a single verdict: PASS, BLOCKED, or HUMAN_APPROVAL_REQUIRED.
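The general shape of a base-commit replay can be sketched with plain git. This is an assumption-heavy illustration, not how Proof Gate is implemented: the function name, the protected paths, the working directory, and the test command are all hypothetical. The function only builds the commands, so nothing runs until you choose to execute them.

```python
def replay_plan(base_sha, head_sha, protected_paths, workdir, test_cmd):
    """Build (cwd, argv) steps for a base-commit replay without running them.

    cwd is None for steps that run from the repository root.
    """
    return [
        # 1. Check out the agent's change into a fresh, detached worktree.
        (None, ["git", "worktree", "add", "--detach", workdir, head_sha]),
        # 2. Restore the protected paths to their base-commit state, so the
        #    checks are the ones that existed before the agent's change.
        #    Each path must exist at the base commit or git will error.
        (workdir, ["git", "restore", f"--source={base_sha}",
                   "--staged", "--worktree", "--", *protected_paths]),
        # 3. Run the base-commit checks against the agent's source changes.
        (workdir, list(test_cmd)),
    ]


plan = replay_plan(
    base_sha="abc123",                        # placeholder commit ids
    head_sha="def456",
    protected_paths=["tests", ".github/workflows"],
    workdir="/tmp/replay",                    # hypothetical location
    test_cmd=["python", "-m", "pytest"],      # hypothetical test command
)
for cwd, argv in plan:
    print(cwd, " ".join(argv))
```

To execute the plan, each step could be passed to `subprocess.run(argv, cwd=cwd, check=True)`. The replay is only as trustworthy as the machine it runs on, so it should run somewhere the agent's change cannot alter the plan itself.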
Is a green CI check enough proof that an AI agent's code fix is real?
No. A green check only proves that the checks which ran passed, and those checks are files the same pull request can modify in the same diff. Restore the evidence by replaying against the base commit in a clean checkout the agent could not edit. This is covered in depth in A Green CI Check Is Not Proof.
If you are shipping AI-written code and either the bill or the trust is keeping you up, email me at kirill@launchsoloai.com with what it does, roughly what it costs, and where it runs. You get back a written teardown within 24 hours or a straight no. You can also see my productized offers and pricing here.