Agents & Orchestration Intermediate
MAX_THINKING_TOKENS Tuning
Cap extended thinking token budget for predictable CI costs
Command
$ MAX_THINKING_TOKENS=8000 claude -p "Refactor auth.py" --output-format json
Response
{
"result": "Refactored auth.py...",
"usage": { "output_tokens": 8000 },
"total_cost_usd": 0.65
}
Parsing Code
// Extended thinking at Opus pricing ($75/M output):
// 10K tokens = $0.75 per response just for thinking
// 20K tokens = $1.50 per response
// Cap with MAX_THINKING_TOKENS for predictable costs
//
// Combine: --effort high + MAX_THINKING_TOKENS=8000
// = full reasoning within a cost ceiling
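The cost arithmetic above, and a check of the JSON response against a ceiling, can be sketched in Python. This is a minimal illustration, not part of the CLI: the helper names `thinking_cost_usd` and `check_response` are hypothetical, the $75/M rate is the figure quoted above rather than a value looked up at runtime, and the sample payload simply mirrors the example response.

```python
import json

# Rate quoted in the notes above ($75 per million output tokens).
# Assumption: hard-coded for illustration, not fetched from any pricing API.
OPUS_OUTPUT_USD_PER_MILLION = 75.0


def thinking_cost_usd(tokens: int, usd_per_million: float = OPUS_OUTPUT_USD_PER_MILLION) -> float:
    """Return the cost in USD of `tokens` output tokens at the given rate."""
    return tokens * usd_per_million / 1_000_000


def check_response(raw: str, cost_ceiling_usd: float) -> dict:
    """Parse a JSON response shaped like the example and compare its cost to a ceiling."""
    data = json.loads(raw)
    cost = data["total_cost_usd"]
    return {
        "output_tokens": data["usage"]["output_tokens"],
        "total_cost_usd": cost,
        "within_ceiling": cost <= cost_ceiling_usd,
    }


# Sample payload copied from the example response above.
sample = (
    '{"result": "Refactored auth.py...", '
    '"usage": {"output_tokens": 8000}, "total_cost_usd": 0.65}'
)

print(thinking_cost_usd(10_000))  # 0.75
print(thinking_cost_usd(20_000))  # 1.5
print(check_response(sample, cost_ceiling_usd=1.00))
```

A CI job could fail the build when `within_ceiling` is false, which turns the cap into an enforced budget rather than an estimate.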
Gotchas
! Extended thinking can consume 10K+ tokens per response ($0.75+ at Opus rates)
! Combine with --effort for maximum control: --effort sets quality, MAX_THINKING_TOKENS caps cost
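For CI scripts, one way to apply the cap is to set the variable only in the child process environment. The sketch below assumes the `claude` binary is on `PATH` and uses only the flags shown in the command above; `build_invocation` and `run_capped` are hypothetical helper names, and `--effort` is left out on purpose, so append it to the argument list only if your CLI version supports it.

```python
import os
import subprocess


def build_invocation(prompt: str, max_thinking_tokens: int) -> tuple[list[str], dict[str, str]]:
    """Build the argv list and environment for a capped, non-interactive run."""
    # Same flags as the command shown earlier: -p and --output-format json.
    cmd = ["claude", "-p", prompt, "--output-format", "json"]
    # Copy the parent environment so the cap applies only to this child process.
    env = dict(os.environ)
    env["MAX_THINKING_TOKENS"] = str(max_thinking_tokens)
    return cmd, env


def run_capped(prompt: str, max_thinking_tokens: int = 8000) -> str:
    """Run the CLI with the cap applied and return its raw stdout (JSON text)."""
    cmd, env = build_invocation(prompt, max_thinking_tokens)
    completed = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return completed.stdout


cmd, env = build_invocation("Refactor auth.py", 8000)
print(cmd)
print(env["MAX_THINKING_TOKENS"])  # 8000
```

Keeping the argument and environment construction separate from the `subprocess.run` call makes the cap testable without invoking the CLI.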