Skip to content

CI/CD Integration

GitHub Actions, custom pipelines, retry logic, and CI flags

15 min read

Your GitHub Action with Claude works perfectly — until a rate limit hits mid-review, the step fails, and the entire pipeline blocks for 20 minutes. Running Claude in CI means planning for failures you don’t see in interactive mode.

Claude Code runs headless in CI/CD pipelines for automated code review, test generation, documentation, and more. The official GitHub Action provides turnkey integration, while the CLI’s --print mode enables custom pipeline workflows with explicit cost caps and retry logic.

GitHub Action

The anthropics/claude-code-action@v1 action wraps the CLI with sensible CI defaults. It auto-detects whether it is running on a PR or issue event, injects the diff context, and posts Claude’s response as a comment.

GitHub Action — PR Review Workflow
$ # .github/workflows/claude-review.yml
$ cat .github/workflows/claude-review.yml
name: Claude PR Review on: pull_request: types: [opened, synchronize] jobs: review: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 with: fetch-depth: 0 - uses: anthropics/claude-code-action@v1 with: prompt: | Review this PR for: 1. Security vulnerabilities 2. Performance issues 3. Code style violations anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} max_budget_usd: 1.00

The action handles checkout, context injection, and comment posting automatically. You supply a prompt, your API key, and an optional budget cap. For more control, you can install the CLI directly and run it yourself.

Custom CLI Pipeline in Actions
$ # Custom CLI pipeline in GitHub Actions
$ npm install -g @anthropic-ai/claude-code
added 1 package in 4s
$ RESULT=$(claude -p "Review changes for bugs" \ --output-format json \ --max-budget-usd 0.50 \ --no-session-persistence \ --permission-mode bypassPermissions)
$ echo "Cost: $(echo "$RESULT" | jq -r '.total_cost_usd')"
Cost: 0.016079

CI Flags

Every CI invocation needs a specific set of flags to run safely in a headless environment. Here is the breakdown.

Essential CI Flags

FlagPurposeWhy It Matters in CI
-p / —printHeadless modeRequired — no interactive UI in CI
—output-format jsonMachine-parseable outputLets you extract cost, session ID, and result with jq
—max-budget-usd NCost cap per invocationPrevents runaway spend — costs can vary 10x for the same prompt
—no-session-persistenceDon’t save session to diskKeeps ephemeral CI runners clean — no leftover session files
—permission-mode bypassPermissionsNo interactive promptsWithout this, the CLI hangs waiting for TTY input and eventually times out
—strict-mcp-configFail if any MCP server can’t connectWithout it, a failed MCP connection is silently ignored — subtle, dangerous
—fallback-model MODELUse alternate model on overloadOnly triggers on HTTP 429 rate limiting, not other failures
—from-pr URLResume session linked to a PRRequires GITHUB_TOKEN in the environment

The minimum viable CI invocation combines four of these flags:

Terminal window
claude -p "Review this code for bugs" \
--output-format json \
--max-budget-usd 0.50 \
--no-session-persistence \
--permission-mode bypassPermissions
Try This

Add budget protection to a CI command and see what happens:

claude -p “Review this PR for security issues” —max-budget-usd 0.50 —output-format json | jq ‘{subtype, cost: .total_cost_usd}’

Did the review complete within budget? What would your script do if subtype came back as “error_max_budget_usd”? Plan for that case now — before it happens in production.

Custom Pipelines

Beyond GitHub Actions, you can embed Claude in any CI system — GitLab CI, Jenkins, CircleCI, or a plain bash script. The pattern is the same everywhere: capture JSON output, parse with jq, and act on the result.

Cost-Capped PR Review:

Terminal window
claude -p "Review the changes in this PR" \
--from-pr "$PR_URL" \
--output-format json \
--max-budget-usd 0.50 \
--no-session-persistence \
--permission-mode bypassPermissions

Batch File Processing:

Terminal window
for file in src/**/*.ts; do
claude -p "Add JSDoc comments to $file" \
--output-format json \
--max-budget-usd 0.10 \
--no-session-persistence \
--permission-mode bypassPermissions
done

Each invocation is fully isolated — no session state leaks between files.

Plan-Review-Execute in CI:

A three-step workflow: generate a plan, post it for human review, then execute on approval.

Terminal window
# Step 1: Generate plan
PLAN_RESULT=$(claude -p "Fix all lint errors in src/" \
--permission-mode plan \
--output-format json \
--max-budget-usd 0.50)
# Step 2: Post plan as PR comment
PLAN=$(echo "$PLAN_RESULT" | jq -r \
'.permission_denials[] | select(.tool_name == "ExitPlanMode") | .tool_input.plan')
gh pr comment "$PR_NUMBER" --body "$PLAN"
# Step 3: On approval, execute
SESSION=$(echo "$PLAN_RESULT" | jq -r '.session_id')
claude -p "yes, proceed" \
--resume "$SESSION" \
--permission-mode bypassPermissions

In plan mode, Claude’s proposed changes appear as permission_denials with tool_name: "ExitPlanMode". The plan text lives in the tool_input.plan field.

CI Response Payload

Here is a real response from a CI run with --no-session-persistence, --max-budget-usd 0.50, and --output-format json.

CI Run — Budget Cap + No Persistenceartifacts/14/ci_flags_test.json
1{
2 "type": "result",
3 "subtype": "success",A
4 "is_error": false,
5 "duration_ms": 1513,
6 "duration_api_ms": 1497,
7 "num_turns": 1,
8 "result": "4",B
9 "stop_reason": "end_turn",E
10 "session_id": "94d02fdc-9f01-4b5c-8bc9-8f08e684d8a3",C
11 "total_cost_usd": 0.016079,D
12 "usage": {
13 "input_tokens": 3,
14 "cache_creation_input_tokens": 1410,
15 "cache_read_input_tokens": 14253,
16 "output_tokens": 5,
17 "server_tool_use": {
18 "web_search_requests": 0,
19 "web_fetch_requests": 0
20 },
21 "service_tier": "standard"
22 },
23 "modelUsage": {
24 "claude-opus-4-6": {
25 "inputTokens": 3,
26 "outputTokens": 5,
27 "cacheReadInputTokens": 14253,
28 "cacheCreationInputTokens": 1410,
29 "costUSD": 0.016079,
30 "contextWindow": 200000,
31 "maxOutputTokens": 32000
32 }
33 },
34 "permission_denials": [],
35 "fast_mode_state": "off"
36}
AAlways check this first in CI — false means the call succeeded
BThe actual output text — extract with jq -r '.result'
CPresent even with --no-session-persistence, but not resumable after exit
DAggregate this across pipeline steps for total CI run cost
Eend_turn means Claude finished naturally — watch for max_tokens if truncated

Key fields to parse in your CI scripts:

  • is_error / subtype — check for "error_max_budget_usd" to detect budget exceeded
  • total_cost_usd — aggregate across pipeline steps for total CI run cost
  • result — the actual output text
  • session_id — present even with --no-session-persistence (valid during process lifetime only)
  • stop_reason"end_turn" means normal completion; other values indicate interruption

Parsing the Result in Bash

#!/usr/bin/env bash
set -euo pipefail
RESULT=$(claude -p "Review this diff for bugs" \
--output-format json \
--max-budget-usd 0.50 \
--no-session-persistence \
--permission-mode bypassPermissions)
# Check for budget exceeded
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype // ""')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then
echo "::error::Claude exceeded budget cap"
exit 1
fi
# Extract result and cost
REVIEW=$(echo "$RESULT" | jq -r '.result')
COST=$(echo "$RESULT" | jq -r '.total_cost_usd')
echo "Review cost: $COST"
echo "$REVIEW"

The ::error:: prefix is GitHub Actions syntax for surfacing errors in the workflow summary. Replace it with your CI system’s equivalent.

--from-pr PR Review Workflow

The --from-pr flag injects PR context into a session, enabling automated code review:

Terminal window
# Review a PR by number (must be in the correct git repo)
claude -p "Review this PR for security issues" \
--from-pr 42 \
--output-format json \
--max-budget-usd 0.25 \
--permission-mode bypassPermissions
# Review by full URL
claude -p "Summarize what this PR changes" \
--from-pr "https://github.com/org/repo/pull/42" \
--output-format json

Experimentally confirmed behaviors:

  • Both PR number and URL formats are accepted syntactically
  • Not a data pre-loader--from-pr signals to the model that a PR review is requested. The model then uses tools (gh pr view, gh pr diff, git log) to investigate. This means it requires tool permissions and sufficient budget for multiple turns
  • Budget consideration — PR reviews are multi-turn operations. A $0.10 budget may not be sufficient for large PRs. Use $0.25+ for thorough reviews
  • Graceful degradation — invalid PR numbers and non-git directories produce helpful messages, not crashes
  • GitHub auth — for private repos, the gh CLI must be authenticated via GITHUB_TOKEN
Tip

For CI/CD PR review pipelines, combine —from-pr with —output-format json and parse the result field: claude -p “Review for security” —from-pr $PR_NUMBER —output-format json | jq -r ‘.result’

Now Do This

Add —max-budget-usd 1.00 —output-format json to your CI Claude command. Parse subtype in the response — if it’s “error_max_budget_usd”, log a warning instead of failing the pipeline. Budget protection in CI prevents surprise bills without breaking your workflow.

CI/CD Resilience Patterns

Your CI pipeline with Claude works 95% of the time. The other 5%, the API returns a 529, your step fails, and the entire PR workflow blocks for 20 minutes while developers wait. Production pipelines need retry logic, fallback models, and graceful degradation — not just a Claude command.

CI pipelines fail. APIs return 500s, rate limits kick in, networks drop. The --fallback-model flag handles one failure mode (429 rate limiting), but production pipelines need explicit retry logic, error classification, and graceful degradation to stay reliable.

Retry Patterns

The --fallback-model flag only handles 429 overload errors. For other transient failures — network errors, timeouts, API 500s — you need explicit retry logic.

Bash Retry Function

Terminal window
retry_claude() {
local max_retries=3 prompt="$1"; shift
for attempt in $(seq 1 "$max_retries"); do
if RESULT=$(claude -p "$prompt" "$@" 2>/dev/null); then
if echo "$RESULT" | jq -e '.is_error != true' >/dev/null 2>&1; then
echo "$RESULT"
return 0
fi
fi
echo "Attempt $attempt/$max_retries failed, retrying in $((attempt * 5))s..." >&2
sleep "$((attempt * 5))"
done
echo "All $max_retries attempts failed" >&2
return 1
}

The function validates two things on each attempt: that the CLI process exited successfully, and that the JSON response does not have is_error: true. Backoff increases linearly — 5 seconds, 10 seconds, 15 seconds.

Node.js Retry Function

For Node.js pipelines, the same pattern translates directly:

async function retryClaudeSync(args, maxRetries = 3) {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
const output = execFileSync('claude', args, {
encoding: 'utf-8', timeout: 120_000,
env: { ...process.env, CLAUDECODE: '' }
});
const data = JSON.parse(output);
if (!data.is_error) return data;
} catch (err) { /* retry */ }
if (attempt < maxRetries) {
await new Promise(r => setTimeout(r, attempt * 5000));
}
}
throw new Error(`All ${maxRetries} attempts failed`);
}

Advanced Retry with Error Classification

Not all errors deserve retries. Auth failures (401/403) will never succeed on retry. Budget errors mean the work is done but exceeded limits. Only transient failures benefit from retrying:

Terminal window
retry_claude_smart() {
local max_retries=3 prompt="$1"; shift
for attempt in $(seq 1 "$max_retries"); do
RESULT=$(claude -p "$prompt" "$@" 2>&1) && {
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype // ""')
case "$SUBTYPE" in
error_max_budget_usd)
echo "Budget exceeded — not retrying" >&2
echo "$RESULT"; return 1 ;;
success)
echo "$RESULT"; return 0 ;;
*)
# Unknown subtype — check is_error
if echo "$RESULT" | jq -e '.is_error != true' >/dev/null 2>&1; then
echo "$RESULT"; return 0
fi ;;
esac
}
# Check for auth failure (non-JSON exit)
if echo "$RESULT" | grep -qi "unauthorized\|forbidden\|invalid.*key"; then
echo "Auth failure — not retrying" >&2
return 1
fi
echo "Attempt $attempt/$max_retries failed, retrying in $((attempt * 5))s..." >&2
sleep "$((attempt * 5))"
done
echo "All $max_retries attempts failed" >&2
return 1
}

This version distinguishes between retryable failures (network errors, 500s, timeouts) and permanent failures (auth errors, budget exceeded) to avoid wasting time and money on doomed retries.

Retry with Exponential Backoff
time 529 overloaded attempt 1 1s 429 rate limit attempt 2 2s 429 rate limit attempt 3 4s Sonnet fallback model primary model fails --fallback-model
Try This

Test the —fallback-model flag by deliberately triggering a rate limit scenario:

claude -p “Hello” —output-format json —fallback-model claude-sonnet-4-6 | jq ‘{model: .modelUsage | keys[0], cost: .total_cost_usd}’

Which model was used? When the primary model is rate-limited (429), Claude automatically falls back to the cheaper model. Does your pipeline handle both cost tiers?

Fallback Models

The --fallback-model flag provides a backup when the primary model is rate-limited:

Terminal window
claude -p "Review this PR" \
--fallback-model claude-haiku-4-5-20251001 \
--max-budget-usd 0.50

There is a critical limitation here. Fallback only triggers on HTTP 429 (rate limiting). It does not trigger on other failures:

Fallback Trigger Matrix

Failure TypeTriggers Fallback?Your Mitigation
Rate limiting (429)Yes—fallback-model handles this automatically
API error (500)NoUse retry logic with backoff
TimeoutNoUse retry logic with backoff
Auth failure (401/403)NoFix credentials — retries will not help

This means --fallback-model is not a general resilience mechanism. Combine it with the retry function above for full coverage.

Invalid Fallback Names Are Silently Accepted

The CLI does not validate fallback model names at startup. A typo like —fallback-model claude-haku-4-5 is silently accepted — the error only surfaces when a 429 triggers fallback and the model name is unreachable. Check modelUsage keys in the response to confirm which model actually served the request: Object.keys(data.modelUsage).

Full Resilience Pipeline

Combining retry logic, fallback models, error classification, and budget controls gives you a production-ready pipeline function:

#!/usr/bin/env bash
set -euo pipefail
# Production-grade CI wrapper
ci_claude() {
local prompt="$1"
local budget="${2:-0.50}"
local max_retries="${3:-3}"
for attempt in $(seq 1 "$max_retries"); do
RESULT=$(claude -p "$prompt" \
--output-format json \
--max-budget-usd "$budget" \
--fallback-model claude-haiku-4-5-20251001 \
--no-session-persistence \
--permission-mode bypassPermissions 2>/dev/null) && {
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype // ""')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then
echo "::warning::Budget exceeded on attempt $attempt" >&2
echo "$RESULT"
return 1
fi
if echo "$RESULT" | jq -e '.is_error != true' >/dev/null 2>&1; then
# Log cost for aggregation
COST=$(echo "$RESULT" | jq -r '.total_cost_usd // 0')
echo "::notice::Claude call cost: \$$COST (attempt $attempt)" >&2
echo "$RESULT"
return 0
fi
}
if [ "$attempt" -lt "$max_retries" ]; then
DELAY=$((attempt * 5))
echo "::warning::Attempt $attempt/$max_retries failed, retrying in ${DELAY}s..." >&2
sleep "$DELAY"
fi
done
echo "::error::All $max_retries attempts failed" >&2
return 1
}
# Usage
REVIEW=$(ci_claude "Review this PR for security issues" 1.00 3)
echo "$REVIEW" | jq -r '.result'

This wrapper combines all the resilience patterns: retry with backoff, fallback model for rate limiting, budget cap enforcement, error classification, and GitHub Actions annotation output (::warning::, ::error::, ::notice::).

Error Handling in CI

Budget Exceeded Detection

Terminal window
SUBTYPE=$(echo "$RESULT" | jq -r '.subtype // ""')
if [ "$SUBTYPE" = "error_max_budget_usd" ]; then
echo "::error::Claude exceeded budget cap"
exit 1
fi

The ::error:: prefix is GitHub Actions syntax for surfacing errors in the workflow summary. Replace it with your CI system’s equivalent.

Gotcha

—max-budget-usd is checked between turns, not mid-generation. The first turn can exceed the budget by any amount. For Opus, system prompt cache creation alone costs ~$0.016, so setting the budget below that guarantees an error_max_budget_usd response with no useful output.

Missing API Key

Gotcha

If ANTHROPIC_API_KEY is missing or invalid, the CLI exits with a non-zero code but may not produce JSON output. Your pipeline should check the exit code before attempting to parse JSON, or the jq step will fail silently and the CI job may report success when it actually did nothing.

Hooks in Headless Mode

Tip

SessionStart and SessionStop hooks do not fire in —print mode. Only PreToolUse, PostToolUse, and Stop hooks fire in headless mode. If your audit logging depends on SessionStart, it will silently skip in CI.

Resilience Next Steps

With resilience patterns in place, your CI pipelines can handle transient failures without human intervention. For managing the cost of running these pipelines at scale, see Budget Controls and Cost at Scale.

Now Do This

Add —fallback-model claude-sonnet-4-6 to your CI Claude commands. This handles 429 rate limits automatically — Claude falls back to Sonnet instead of failing. Wrap the call in a retry loop for other error types: for i in 1 2 3; do RESULT=$(claude -p ”…” —output-format json —fallback-model claude-sonnet-4-6) && break; sleep 5; done.