TL;DR — Claude Code always calls Anthropic's API with whatever model you've configured. Configure it with
model: autopointed at CodeRouter, and a phase-aware router decides per-call which model runs — keeping Opus for genuine planning / hard debug phases, moving the 70% of bread-and-butter calls to Sonnet 4.6 or DeepSeek V3. Same Claude Code UX, 60–80% lower bill.
Why Claude Code is expensive
Claude Code is purpose-built for long agentic coding sessions — Plan mode, long tool chains, file-read/write, Bash execution. It's probably the most efficient coding agent at output quality per engineered result.
The problem is cost per token:
- Claude Code defaults to Opus 4.7 — $15 input / $75 output per 1M tokens.
- Agentic tool use means every step reships the growing context. A 50-step session with 80K accumulated context = ~$4M of context tokens re-processed at Opus rates.
- Plan mode (super useful) runs on Opus too. Fine for the planning call. Wasteful for the subsequent implementation calls that follow the plan.
A typical full Claude Code day with 3–5 planning sessions and 100+ implementation steps easily hits $30–80/day direct to Anthropic. That's $700–2400/month for one heavy user.
How phase-aware routing reduces this
Claude Code's workflow actually maps beautifully onto phase detection:
| Claude Code mode | Phase detected | Router's pick |
|---|---|---|
| Plan mode (<plan_mode> tag) | plan (confidence 0.95) | Claude Opus 4.7 ← stays |
| Tool call: Read followed by user asking a question | plan (multiple file reads = review) | GPT-5.2 / Sonnet 4.6 |
| Tool call: Bash → output contains Error: / traceback | debug | Sonnet 4.6 / DeepSeek R1 |
| Tool call: Edit / Write with implementation intent | implement | Sonnet 4.6 / DeepSeek V3 |
| Tool call: Write to a *_test.py or test_*.ts file | test | DeepSeek V3 / Kimi K2.5 |
| Tool call: Write to a .md file | document | Haiku 4.5 / Gemini Flash |
The router uses Claude Code's own system prompt to fingerprint it with 95% confidence, then uses Plan mode tag + tool-result inspection to pick the right phase. No config needed beyond model: auto.
Setup — Claude Code proxy configuration
Two ways to configure Claude Code: env vars (quickest, single-session) or ~/.claude/settings.json (persistent across launches). Both shown below.
Critical: the env-var name trap
Before you copy any setup snippet — the env var that authenticates Claude Code against a third-party gateway is ANTHROPIC_AUTH_TOKEN, not ANTHROPIC_API_KEY.
If you set ANTHROPIC_API_KEY=cr_xxx and run Claude Code:
- The env var is silently ignored
- Claude Code falls back to your existing OAuth session
- The header at the top shows Claude Max instead of API Usage Billing
- All requests still go to api.anthropic.com directly — bypassing your gateway entirely
Why: ANTHROPIC_API_KEY is hard-wired to authenticate against api.anthropic.com. When ANTHROPIC_BASE_URL points elsewhere, only ANTHROPIC_AUTH_TOKEN (Bearer token mode) is honored.
Sanity check: after launching, look at the badge in Claude Code's header. If it doesn't say API Usage Billing, your env var didn't take effect. Run claude logout and try again.
Env vars (one-shot)
# Install Claude Code if you haven't already
$ npm install -g @anthropic-ai/claude-code@latest
# Point Claude Code at CodeRouter — note AUTH_TOKEN, not API_KEY
$ export ANTHROPIC_BASE_URL=https://www.coderouter.io/api/v1
$ export ANTHROPIC_AUTH_TOKEN=cr_your_coderouter_key
# Force every model tier through smart routing. Without these, Claude Code's
# /model picker may try a hardcoded default that doesn't exist on our side.
$ export ANTHROPIC_MODEL=auto
$ export ANTHROPIC_DEFAULT_OPUS_MODEL=auto
$ export ANTHROPIC_DEFAULT_SONNET_MODEL=auto
$ export ANTHROPIC_DEFAULT_HAIKU_MODEL=auto
# v2.1.x had experimental beta features that some third-party gateways
# don't yet implement. Disabling them is harmless and avoids 400 errors.
$ export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1
$ claude
Persistent — ~/.claude/settings.json
The persistent config lives at ~/.claude/settings.json (note: .claude/ in your home dir, not ~/.config/claude-code/). Format:
{
"env": {
"ANTHROPIC_BASE_URL": "https://www.coderouter.io/api/v1",
"ANTHROPIC_AUTH_TOKEN": "cr_your_coderouter_key",
"ANTHROPIC_MODEL": "auto",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "auto",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "auto",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "auto",
"CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
}
}
When you next run claude, it picks up this env block automatically — no shell-level export needed.
When Claude Code connects, it speaks Anthropic's native /v1/messages protocol. CodeRouter translates that into whichever provider phase detection chose (Anthropic for Opus / Sonnet / Haiku calls, OpenAI / DeepSeek / Google for non-Anthropic picks) and translates the response back so Claude Code sees a consistent Anthropic-shaped reply regardless of which upstream actually served it.
Troubleshooting — the four errors you'll likely hit
These are the actual production failures we've seen from real testers in the past 48 hours. If you hit one, the fix is below.
"There's an issue with the selected model (auto). It may not exist..."
This is Claude Code's catch-all error for "model validation failed". Three independent causes can trigger it:
-
Wrong env var name — see the AUTH_TOKEN section above. Switch from
ANTHROPIC_API_KEYtoANTHROPIC_AUTH_TOKENand confirm the header shows "API Usage Billing". -
Experimental betas (v2.1.x only) — Claude Code v2.1.x sends an
anthropic-betaheader for features that some gateways don't implement. Add:export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1Or in
settings.json, include"CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"in theenvblock. -
Older Claude Code without the
automodel in/modelpicker — Claude Code v2.1.80 and earlier don't show a "Custom model" option in the/modelslash command. Either pass it on the CLI:claude --model autoOr upgrade:
npm install -g @anthropic-ai/claude-code@latest.
"Free plan requires your own provider API keys" (HTTP 403)
CodeRouter's Free trial requires Bring-Your-Own-Key (you supply Anthropic / OpenAI keys, we just route). Three options:
- Add your own provider key at /dashboard/models — keep using Free with smart routing
- Upgrade to a paid plan — Starter from $19/mo, Pro $99/mo gets you 30M tokens
- (Internal teams only) Get whitelisted by your org's CodeRouter admin
Claude Code header still shows "Claude Max"
Existing OAuth session is winning over your env vars. Logout first:
claude logout
# then re-run with your env vars set
"Anthropic API error 400: Unable to download the file"
Caused by a base64-encoded image being forwarded to Anthropic as {type: "url"} instead of {type: "base64"}. Already fixed in CodeRouter — if you still hit this, the image MIME type may be unsupported (Anthropic only accepts image/png, image/jpeg, image/gif, image/webp).
Verifying routing is active
After your first Claude Code session:
- Open the CodeRouter dashboard → Analytics.
- Look at the Phase Distribution chart. You should see a mix — if 100% of requests show as one phase (e.g. all "implement"), detection isn't picking up your agent's patterns and you should check your Claude Code version.
- Check the Models Used chart. A healthy distribution: ~10% Opus 4.7, ~40% Sonnet 4.6, ~25% DeepSeek V3, ~15% Haiku 4.5, ~10% others. All-Opus means the phase detector is treating everything as
plan— bug, report to us. - Weekly Savings card shows what Opus-for-everything would have cost.
Real-world cost example
Claude Code daily session logs from an internal test (~200 tool calls, 4 hours of use):
| Setup | Cost for that day | |---|---| | Claude Code → Anthropic direct (Opus 4.7 default) | $42.18 | | Claude Code → Anthropic direct (Sonnet 4.6 model flag) | $8.94 | | Claude Code → CodeRouter | $3.81 (blended across phases) |
That's 91% vs. Opus-direct, or 57% vs. the already-optimized Sonnet-direct config. Plan mode calls still hit Opus (as they should), but the 160+ implementation/test/docstring/refactor calls that followed got distributed across cheaper models.
FAQ
Will Plan mode still work?
Yes, and that's the whole point — CodeRouter recognizes Plan mode via the <plan_mode> system prompt marker and keeps routing it to Opus. What changes is the post-plan implementation calls that Claude Code makes by the dozen.
What about Claude Code's native MCP tool ecosystem? Unaffected. MCP servers are local; only the LLM calls route through CodeRouter.
Does routing to DeepSeek/Gemini break Claude Code's tool-call format?
No — CodeRouter translates Anthropic's tool_use blocks into OpenAI's tool_calls format for non-Anthropic providers and back. Claude Code sees a consistent Anthropic-shaped response regardless of which upstream served it.
What about Claude Code on Bedrock / Vertex? If you're on Bedrock, keep your existing setup — no CodeRouter needed since Bedrock pricing differs. We integrate with native Anthropic API, not with cloud-provider resellers.
Related
- Full integration docs for every agent — Cursor, Aider, Cline, Continue, Windsurf, OpenClaw, Copilot via LiteLLM, raw SDKs
- How to Cut Your Cursor Bill by 70–90%
- Aider Cost Optimization 2026
- Phase-Aware LLM Routing Explained
- CodeRouter vs OpenRouter