← Back to Blog

Claude Code Cheap API Router Setup (2026 Guide)

2026-04-21·8 min read·CodeRouter Team
claude code api costclaude code cheap routerclaude code alternative modelclaude code opus costclaude code deepseekanthropic api cheap codingANTHROPIC_AUTH_TOKENANTHROPIC_BASE_URLclaude code custom endpointclaude code third-party gateway

TL;DR — Claude Code always calls Anthropic's API with whatever model you've configured. Configure it with model: auto pointed at CodeRouter, and a phase-aware router decides per-call which model runs — keeping Opus for genuine planning / hard debug phases, moving the 70% of bread-and-butter calls to Sonnet 4.6 or DeepSeek V3. Same Claude Code UX, 60–80% lower bill.

Why Claude Code is expensive

Claude Code is purpose-built for long agentic coding sessions — Plan mode, long tool chains, file-read/write, Bash execution. It's probably the most efficient coding agent at output quality per engineered result.

The problem is cost per token:

A typical full Claude Code day with 3–5 planning sessions and 100+ implementation steps easily hits $30–80/day direct to Anthropic. That's $700–2400/month for one heavy user.

How phase-aware routing reduces this

Claude Code's workflow actually maps beautifully onto phase detection:

| Claude Code mode | Phase detected | Router's pick | |---|---|---| | Plan mode (<plan_mode> tag) | plan (confidence 0.95) | Claude Opus 4.7 ← stays | | Tool call: Read followed by user asking a question | plan (multiple file reads = review) | GPT-5.2 / Sonnet 4.6 | | Tool call: Bash → output contains Error: / traceback | debug | Sonnet 4.6 / DeepSeek R1 | | Tool call: Edit / Write with implementation intent | implement | Sonnet 4.6 / DeepSeek V3 | | Tool call: Write to a *_test.py or test_*.ts file | test | DeepSeek V3 / Kimi K2.5 | | Tool call: Write to a .md file | document | Haiku 4.5 / Gemini Flash |

The router uses Claude Code's own system prompt to fingerprint it with 95% confidence, then uses Plan mode tag + tool-result inspection to pick the right phase. No config needed beyond model: auto.

Setup — Claude Code proxy configuration

Two ways to configure Claude Code: env vars (quickest, single-session) or ~/.claude/settings.json (persistent across launches). Both shown below.

Critical: the env-var name trap

Before you copy any setup snippet — the env var that authenticates Claude Code against a third-party gateway is ANTHROPIC_AUTH_TOKEN, not ANTHROPIC_API_KEY.

If you set ANTHROPIC_API_KEY=cr_xxx and run Claude Code:

Why: ANTHROPIC_API_KEY is hard-wired to authenticate against api.anthropic.com. When ANTHROPIC_BASE_URL points elsewhere, only ANTHROPIC_AUTH_TOKEN (Bearer token mode) is honored.

Sanity check: after launching, look at the badge in Claude Code's header. If it doesn't say API Usage Billing, your env var didn't take effect. Run claude logout and try again.

Env vars (one-shot)

# Install Claude Code if you haven't already
$ npm install -g @anthropic-ai/claude-code@latest

# Point Claude Code at CodeRouter — note AUTH_TOKEN, not API_KEY
$ export ANTHROPIC_BASE_URL=https://www.coderouter.io/api/v1
$ export ANTHROPIC_AUTH_TOKEN=cr_your_coderouter_key

# Force every model tier through smart routing. Without these, Claude Code's
# /model picker may try a hardcoded default that doesn't exist on our side.
$ export ANTHROPIC_MODEL=auto
$ export ANTHROPIC_DEFAULT_OPUS_MODEL=auto
$ export ANTHROPIC_DEFAULT_SONNET_MODEL=auto
$ export ANTHROPIC_DEFAULT_HAIKU_MODEL=auto

# v2.1.x had experimental beta features that some third-party gateways
# don't yet implement. Disabling them is harmless and avoids 400 errors.
$ export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1

$ claude

Persistent — ~/.claude/settings.json

The persistent config lives at ~/.claude/settings.json (note: .claude/ in your home dir, not ~/.config/claude-code/). Format:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://www.coderouter.io/api/v1",
    "ANTHROPIC_AUTH_TOKEN": "cr_your_coderouter_key",
    "ANTHROPIC_MODEL": "auto",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "auto",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "auto",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "auto",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}

When you next run claude, it picks up this env block automatically — no shell-level export needed.

When Claude Code connects, it speaks Anthropic's native /v1/messages protocol. CodeRouter translates that into whichever provider phase detection chose (Anthropic for Opus / Sonnet / Haiku calls, OpenAI / DeepSeek / Google for non-Anthropic picks) and translates the response back so Claude Code sees a consistent Anthropic-shaped reply regardless of which upstream actually served it.

Troubleshooting — the four errors you'll likely hit

These are the actual production failures we've seen from real testers in the past 48 hours. If you hit one, the fix is below.

"There's an issue with the selected model (auto). It may not exist..."

This is Claude Code's catch-all error for "model validation failed". Three independent causes can trigger it:

  1. Wrong env var name — see the AUTH_TOKEN section above. Switch from ANTHROPIC_API_KEY to ANTHROPIC_AUTH_TOKEN and confirm the header shows "API Usage Billing".

  2. Experimental betas (v2.1.x only) — Claude Code v2.1.x sends an anthropic-beta header for features that some gateways don't implement. Add:

    export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1
    

    Or in settings.json, include "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1" in the env block.

  3. Older Claude Code without the auto model in /model picker — Claude Code v2.1.80 and earlier don't show a "Custom model" option in the /model slash command. Either pass it on the CLI:

    claude --model auto
    

    Or upgrade: npm install -g @anthropic-ai/claude-code@latest.

"Free plan requires your own provider API keys" (HTTP 403)

CodeRouter's Free trial requires Bring-Your-Own-Key (you supply Anthropic / OpenAI keys, we just route). Three options:

  1. Add your own provider key at /dashboard/models — keep using Free with smart routing
  2. Upgrade to a paid plan — Starter from $19/mo, Pro $99/mo gets you 30M tokens
  3. (Internal teams only) Get whitelisted by your org's CodeRouter admin

Claude Code header still shows "Claude Max"

Existing OAuth session is winning over your env vars. Logout first:

claude logout
# then re-run with your env vars set

"Anthropic API error 400: Unable to download the file"

Caused by a base64-encoded image being forwarded to Anthropic as {type: "url"} instead of {type: "base64"}. Already fixed in CodeRouter — if you still hit this, the image MIME type may be unsupported (Anthropic only accepts image/png, image/jpeg, image/gif, image/webp).

Verifying routing is active

After your first Claude Code session:

  1. Open the CodeRouter dashboard → Analytics.
  2. Look at the Phase Distribution chart. You should see a mix — if 100% of requests show as one phase (e.g. all "implement"), detection isn't picking up your agent's patterns and you should check your Claude Code version.
  3. Check the Models Used chart. A healthy distribution: ~10% Opus 4.7, ~40% Sonnet 4.6, ~25% DeepSeek V3, ~15% Haiku 4.5, ~10% others. All-Opus means the phase detector is treating everything as plan — bug, report to us.
  4. Weekly Savings card shows what Opus-for-everything would have cost.

Real-world cost example

Claude Code daily session logs from an internal test (~200 tool calls, 4 hours of use):

| Setup | Cost for that day | |---|---| | Claude Code → Anthropic direct (Opus 4.7 default) | $42.18 | | Claude Code → Anthropic direct (Sonnet 4.6 model flag) | $8.94 | | Claude Code → CodeRouter | $3.81 (blended across phases) |

That's 91% vs. Opus-direct, or 57% vs. the already-optimized Sonnet-direct config. Plan mode calls still hit Opus (as they should), but the 160+ implementation/test/docstring/refactor calls that followed got distributed across cheaper models.

FAQ

Will Plan mode still work? Yes, and that's the whole point — CodeRouter recognizes Plan mode via the <plan_mode> system prompt marker and keeps routing it to Opus. What changes is the post-plan implementation calls that Claude Code makes by the dozen.

What about Claude Code's native MCP tool ecosystem? Unaffected. MCP servers are local; only the LLM calls route through CodeRouter.

Does routing to DeepSeek/Gemini break Claude Code's tool-call format? No — CodeRouter translates Anthropic's tool_use blocks into OpenAI's tool_calls format for non-Anthropic providers and back. Claude Code sees a consistent Anthropic-shaped response regardless of which upstream served it.

What about Claude Code on Bedrock / Vertex? If you're on Bedrock, keep your existing setup — no CodeRouter needed since Bedrock pricing differs. We integrate with native Anthropic API, not with cloud-provider resellers.

Related

Ready to Reduce Your AI API Costs?

CodeRouter routes every API call to the optimal model — automatically. Start saving today.

Get Started Free →

Get weekly AI cost optimization tips

Join 2,000+ developers saving on LLM costs