Integration Guide

CodeRouter exposes both an OpenAI-compatible endpoint (/v1/chat/completions) and an Anthropic-compatible endpoint (/v1/messages). Every major coding agent speaks one of these protocols, so usually you only need to override a single base-URL environment variable. Below are the per-agent recipes, with the gotchas we've hit in production.

Claude Code

Anthropic's official CLI · Anthropic protocol

Claude Code v2.x speaks the Anthropic Messages API. Point it at CodeRouter's /api/v1 base and use auto as the model.

⚠ Critical: Use ANTHROPIC_AUTH_TOKEN, not ANTHROPIC_API_KEY. The latter only works against api.anthropic.com directly; when pointed at a third-party gateway, Claude Code silently ignores ANTHROPIC_API_KEY and falls back to your existing OAuth session, so the header will still show "Claude Max" instead of "API Usage Billing".

Quickest setup — env vars

$ in your shell
export ANTHROPIC_BASE_URL=https://www.coderouter.io/api/v1
export ANTHROPIC_AUTH_TOKEN=cr_YOUR_KEY_HERE
export ANTHROPIC_MODEL=auto
export ANTHROPIC_DEFAULT_OPUS_MODEL=auto
export ANTHROPIC_DEFAULT_SONNET_MODEL=auto
export ANTHROPIC_DEFAULT_HAIKU_MODEL=auto

# v2.x experimental beta features can break third-party gateways.
# This env disables them; harmless to leave on permanently.
export CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1

claude

Persistent — ~/.claude/settings.json

~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://www.coderouter.io/api/v1",
    "ANTHROPIC_AUTH_TOKEN": "cr_YOUR_KEY_HERE",
    "ANTHROPIC_MODEL": "auto",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "auto",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "auto",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "auto",
    "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS": "1"
  }
}

Verify it's routing through us

After launch, the Claude Code header should show API Usage Billing — not Claude Max. If you still see Claude Max, your existing OAuth session is winning over the env vars:

claude logout       # clear OAuth session
# then re-launch with the env vars set
Older versions (v2.1.x): If you can't add custom model names via the /model command, pass --model auto on the command line: claude --model auto. Or upgrade with npm i -g @anthropic-ai/claude-code@latest.

Cursor

Cursor IDE · OpenAI-compatible protocol

Cursor whitelists model names by string — you must add auto as a custom model before it appears in the picker. Setup is UI-only, no env vars.

  1. Open Cursor, press Cmd+Shift+P (or Ctrl+Shift+P) → type "Cursor Settings"
  2. Click Models in the left sidebar
  3. Find the OpenAI API Key section, toggle "Override OpenAI Base URL" ON, paste:
    • URL: https://www.coderouter.io/api/v1
    • Key: cr_YOUR_KEY_HERE
  4. Click Verify; it should show a green ✓
  5. Scroll up to Models, click + Add custom model, type auto
  6. Tick the auto checkbox to enable it
  7. In the chat model dropdown (top right of chat), pick auto
Composer + Cmd+K work too: Cursor uses the same model setting for chat, Composer, and inline edits. Once auto is enabled, every Cursor feature routes through us.
Verify failing with "invalid model"? Cursor tests with its default model first; that's fine, the check only confirms auth. After you add auto as a custom model, real requests will work.

Aider

CLI · architect mode · OpenAI-compatible protocol

Aider has a hardcoded model whitelist. The openai/ prefix bypasses the validator and routes through whatever endpoint OPENAI_API_BASE points to.

Env vars + CLI

export OPENAI_API_BASE=https://www.coderouter.io/api/v1
export OPENAI_API_KEY=cr_YOUR_KEY_HERE

# Single-model mode
aider --model openai/auto

# Architect mode — both planner + editor route through CodeRouter
# (phase detector picks Opus for planning, DeepSeek for diff apply)
aider --model openai/auto --architect

Persistent — ~/.aider.conf.yml

~/.aider.conf.yml
model: openai/auto
openai-api-base: https://www.coderouter.io/api/v1
openai-api-key: cr_YOUR_KEY_HERE
# Optional: pin editor model for architect mode
# editor-model: openai/deepseek-chat
Aider's --architect is the killer combo: Aider in architect mode makes two distinct API calls per turn, one for planning and one for applying diffs. Our phase detector picks them up automatically and routes the planner to Opus and the editor to DeepSeek V3. Pure win, zero config beyond --architect.

Cline

VS Code agent extension · OpenAI-compatible protocol

Cline (formerly Claude Dev) supports custom OpenAI-compatible endpoints out of the box. Configure via the extension settings panel:

  1. VS Code → install Cline extension
  2. Open Cline panel → click ⚙ Settings icon
  3. API Provider: OpenAI Compatible
  4. Base URL: https://www.coderouter.io/api/v1
  5. API Key: cr_YOUR_KEY_HERE
  6. Model ID: auto

Or via settings.json

{
  "cline.apiProvider": "openai",
  "cline.openAiBaseUrl": "https://www.coderouter.io/api/v1",
  "cline.openAiApiKey": "cr_YOUR_KEY_HERE",
  "cline.openAiModelId": "auto"
}

Continue.dev

VS Code + JetBrains extension · OpenAI-compatible protocol

Continue uses ~/.continue/config.json for provider config. Add two entries: one for chat (using auto) and one for tab autocomplete (pinned to a fast model).

~/.continue/config.json
{
  "models": [
    {
      "title": "CodeRouter (auto)",
      "provider": "openai",
      "model": "auto",
      "apiBase": "https://www.coderouter.io/api/v1",
      "apiKey": "cr_YOUR_KEY_HERE"
    }
  ],
  "tabAutocompleteModel": {
    "title": "CodeRouter Tab",
    "provider": "openai",
    "model": "gpt-5-mini",
    "apiBase": "https://www.coderouter.io/api/v1",
    "apiKey": "cr_YOUR_KEY_HERE"
  }
}
Don't use auto for tab autocomplete: Tab autocomplete fires on every keystroke, and routing decisions add ~50ms of overhead. Pin tab completion to gpt-5-mini directly for sub-100ms first token; use auto only for chat and agent calls.

Windsurf

Codeium IDE · OpenAI-compatible protocol

Windsurf supports custom OpenAI-compatible providers via Settings → Cascade → Custom Models. Same pattern as Cursor: add auto as a custom model first, then select it.

  1. Windsurf → Settings → Cascade
  2. Provider: Custom OpenAI
  3. Base URL: https://www.coderouter.io/api/v1
  4. API Key: cr_YOUR_KEY_HERE
  5. Custom Models → add auto
  6. Active model → select auto

OpenClaw

Open-source CLI agent · OpenAI-compatible protocol

OpenClaw uses ~/.openclaw/openclaw.json for provider configuration. The fastest setup uses our auto-detect script:

One-line setup script

curl -fsSL https://www.coderouter.io/setup.sh | bash -s -- cr_YOUR_KEY_HERE

Manual — ~/.openclaw/openclaw.json

{
  "models": {
    "providers": {
      "coderouter": {
        "baseUrl": "https://www.coderouter.io/api/v1",
        "apiKey": "cr_YOUR_KEY_HERE",
        "api": "openai-completions",
        "models": [{ "id": "auto", "name": "CodeRouter Auto" }]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": { "primary": "coderouter/auto" }
    }
  }
}

GitHub Copilot

via LiteLLM gateway · indirect setup

GitHub Copilot doesn't support custom endpoints natively. The proven pattern is to run LiteLLM as a local proxy that translates Copilot's requests into our OpenAI-compatible format.

  1. Install LiteLLM: pip install "litellm[proxy]"
  2. Create a config pointing LiteLLM at CodeRouter (single openai-compatible model named auto)
  3. Run litellm --config config.yaml --port 4000
  4. Point Copilot at http://localhost:4000 via VSCode setting github.copilot.advanced.debug.overrideProxyUrl
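
For step 2, a minimal LiteLLM config is enough. This sketch uses LiteLLM's standard model_list format; the key and URL are placeholders:

```yaml
# config.yaml — a single LiteLLM model entry that forwards to CodeRouter
model_list:
  - model_name: auto
    litellm_params:
      model: openai/auto            # openai/ prefix = generic OpenAI-compatible backend
      api_base: https://www.coderouter.io/api/v1
      api_key: cr_YOUR_KEY_HERE
```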

Detailed walkthrough in our blog: Cut Copilot bills with CodeRouter.

OpenAI SDK (Python / Node / Go)

Drop-in replacement for any openai client

Just override base_url and api_key. Set model="auto" to enable phase routing.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://www.coderouter.io/api/v1",
    api_key="cr_YOUR_KEY_HERE",
)
resp = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Refactor this function"}],
)
print(resp.choices[0].message.content)
print(resp._coderouter)
# {request_id, model_used, task_type, complexity, cost, ...}

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://www.coderouter.io/api/v1",
  apiKey: process.env.CODEROUTER_KEY,
});

const resp = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Write a Python quicksort" }],
});
console.log(resp.choices[0].message.content);

Anthropic SDK

Native Anthropic Messages format

If your code is already using @anthropic-ai/sdk or anthropic (Python), you can point it at our endpoint and we'll route via the /v1/messages protocol.

Python

from anthropic import Anthropic

client = Anthropic(
    base_url="https://www.coderouter.io/api/v1",
    auth_token="cr_YOUR_KEY_HERE",  # NOT api_key — use auth_token for third-party gateways
)
msg = client.messages.create(
    model="auto",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
print(msg.content[0].text)

Verify your setup

Sanity checks that don't cost tokens

Use the X-Dry-Run: true header to get a routing decision back without actually invoking a model:

curl https://www.coderouter.io/api/v1/chat/completions \
  -H "Authorization: Bearer cr_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Dry-Run: true" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Refactor this Python function"}]}'

Response should include the chosen model, detected phase, and cost estimate:

{
  "dryRun": true,
  "routing": {
    "taskType": "coding",
    "complexity": "low",
    "modelSelected": "deepseek-chat",
    "strategy": "balanced",
    "fallbackChain": ["claude-sonnet-4.6", "gemini-3-pro"],
    "estimatedCost": { "selected": "$0.0002", "ifOpus": "$0.0377", "savings": "99.4%" }
  }
}

For a real call (small request, real tokens):

curl https://www.coderouter.io/api/v1/chat/completions \
  -H "Authorization: Bearer cr_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Hello"}]}'

Response headers carry the routing trace — X-CodeRouter-Model, X-CodeRouter-Phase, X-CodeRouter-Cost, X-CodeRouter-Savings — so you can confirm phase routing is active even from a black-box agent.
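
If you script your checks, those trace headers are easy to lift out of any HTTP client's response. A minimal sketch in Python (routing_trace is an illustrative helper, not part of our SDK; the sample headers mirror the names above):

```python
def routing_trace(headers: dict) -> dict:
    """Collect the X-CodeRouter-* trace headers, case-insensitively."""
    prefix = "x-coderouter-"
    return {
        k.lower().removeprefix(prefix): v
        for k, v in headers.items()
        if k.lower().startswith(prefix)
    }

# Works on any headers mapping (requests exposes resp.headers the same way)
trace = routing_trace({
    "Content-Type": "application/json",
    "X-CodeRouter-Model": "deepseek-chat",
    "X-CodeRouter-Phase": "coding",
})
print(trace)  # {'model': 'deepseek-chat', 'phase': 'coding'}
```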

Troubleshooting

Common errors and fixes

"There's an issue with the selected model (auto)"

Claude Code only. Three causes (in order of likelihood):

  1. Using ANTHROPIC_API_KEY instead of ANTHROPIC_AUTH_TOKEN — switch env var name and check the badge reads API Usage Billing, not Claude Max.
  2. Old Claude Code v2.1.x with new beta features — add CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 to env.
  3. Wrong base URL shape causing double /v1 on requests. We accept both https://www.coderouter.io and https://www.coderouter.io/api/v1 via internal rewrites.

"Free plan requires your own provider API keys" (HTTP 403)

Free trial requires BYOK — either upgrade to a paid plan at /pricing, or add your own provider key at /dashboard/models.

"Rate limit exceeded" (HTTP 429)

Rate limits apply per key and per user. Defaults (req/min): Free 60, Starter 100, Solo 200, Pro 600, Studio 1200, Team 2400. Check the X-RateLimit-Reset response header for retry timing.

"Invalid max_tokens value" from DeepSeek

Should not happen — we clamp max_tokens to 8192 (DeepSeek's cap) before forwarding. If you hit this, your client is sending a value our clamp doesn't cover; report it via issues.
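
If you want a belt-and-braces guard anyway, mirroring the clamp client-side is one line. A sketch (the constant and function are illustrative, not part of any SDK):

```python
DEEPSEEK_MAX_TOKENS = 8192  # DeepSeek's cap, per the note above

def clamp_max_tokens(requested: int, cap: int = DEEPSEEK_MAX_TOKENS) -> int:
    """Clamp max_tokens before sending, matching the server-side behavior."""
    return min(requested, cap)

print(clamp_max_tokens(32768))  # 8192
print(clamp_max_tokens(1024))   # 1024
```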

"Unable to download the file" from Anthropic

This means an image was sent as a data URL. Our Anthropic adapter should auto-convert it to base64; if you still hit this, the image's MIME type is unrecognized (Anthropic supports only image/png, image/jpeg, image/gif, and image/webp).
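
For reference, the conversion looks roughly like this. An illustrative Python sketch (the function is hypothetical, not our adapter's actual code; the output shape is Anthropic's base64 image content block):

```python
import re

ANTHROPIC_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}

def data_url_to_image_block(data_url: str) -> dict:
    """Turn a base64 data URL into an Anthropic image content block."""
    m = re.match(r"data:([^;]+);base64,(.+)", data_url, flags=re.DOTALL)
    if not m:
        raise ValueError("not a base64 data URL")
    media_type, payload = m.groups()
    if media_type not in ANTHROPIC_IMAGE_TYPES:
        raise ValueError(f"unsupported image type: {media_type}")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": payload},
    }

block = data_url_to_image_block("data:image/png;base64,iVBORw0KGgo=")
print(block["source"]["media_type"])  # image/png
```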

Anything else

Email support@coderouter.io with:

  • The exact error message + HTTP status code
  • Your agent name + version
  • The full curl reproduction (omit your cr_ key)