The Setup
Set a tight budget on an expensive prompt:
claude -p "Write a 5000 word essay about the history of computing" \ --output-format json \ --max-budget-usd 0.05Two minutes later, the full essay came back. The bill: $0.152 — 3x the budget. Claude finished the entire generation before the budget check fired.
Why This Happens
The --max-budget-usd flag is not a hard cap. It is a between-turn check. Here is the actual enforcement sequence:
- Claude starts generating a response
- The full response completes (one turn)
- Then the budget is checked
- If exceeded, the next turn does not start
Step 2 is the problem. A single turn runs to completion before any budget check fires. If that turn generates 5,446 output tokens over 126 seconds, the budget flag can only watch.
Here is the JSON response:
{ "type": "result", "subtype": "error_max_budget_usd", "is_error": false, "total_cost_usd": 0.1521415, "modelUsage": { "claude-opus-4-6": { "outputTokens": 5446, "cacheReadInputTokens": 14253, "costUSD": 0.1521415 } }}Notice is_error is false. This is not a crash. It is a controlled stop that happened after the money was already spent.
The Minimum Floor
Every Claude CLI call has a minimum cost from the system prompt alone. For Opus, that floor is approximately $0.016 per call — about 14,253 cached tokens read before Claude even sees your prompt.
Any budget below this floor is guaranteed to be exceeded:
| Budget | Actual Cost | What Happens |
|---|---|---|
| $0.01 | $0.017 | System prompt floor exceeds budget |
| $0.05 | $0.016 - $0.15 | Works for short answers, exceeds on long ones |
| $0.50 | $0.02 - $0.48 | Reliable for single-turn tasks |
| $2.00+ | Near budget | Budget controls work as expected |
Below $0.02 with Opus, the budget flag is just a “run once and stop” mechanism.
The Fix
Set realistic minimums. For Opus single-turn tasks, use at least $0.10. For multi-turn sessions, $1.00 or more.
Detect overshoot in automation. Check the exit code and subtype field:
RESULT=$(claude -p "Your prompt" --output-format json --max-budget-usd 1.00)SUBTYPE=$(echo "$RESULT" | jq -r '.subtype')COST=$(echo "$RESULT" | jq '.total_cost_usd')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then echo "Budget exceeded: spent $COST"fiTrack cumulative costs externally. The budget flag tracks per-session spend, but for CI/CD pipelines running hundreds of calls, you need a wrapper that accumulates total_cost_usd across invocations and kills the pipeline when a threshold is hit.
Use Sonnet for cheap tasks. Sonnet’s minimum per-call cost is $0.005 — roughly 3x cheaper than Opus. For formatting, summarization, and simple questions, --model claude-sonnet-4-6 is the right call.
The Mental Model
Think of --max-budget-usd as a governor on a car, not a brick wall. It limits how many laps you take, but it cannot stop the car mid-lap. Each lap (turn) runs to completion, and the governor only checks between laps.
For hard real-time budget enforcement, you need external tooling. The CLI’s budget flag is a useful guardrail, not a billing guarantee.