
Fix: DeepSeek 400 Error — "reasoning_content in thinking mode must be passed back"

2026-07-05·5 min read·CodeRouter Team

TL;DR — DeepSeek's thinking-mode models (like DeepSeek V4 with thinking enabled) return each response with a reasoning_content field next to content. Once the model makes a tool call, the API requires that reasoning_content be sent back on every assistant message in the conversation history. Most OpenAI-compatible clients silently strip unknown fields when they rebuild history — the next request then fails with HTTP 400: The "reasoning_content" in the thinking mode must be passed back to the API. Fix it by preserving reasoning_content in replayed assistant messages, routing through a proxy that round-trips it for you, or disabling thinking mode for tool-heavy workloads.

The error

You call a DeepSeek thinking model through the OpenAI-compatible endpoint, the first turn works, the model makes a tool call — and the second request fails:

HTTP 400
{
  "error": {
    "message": "The \"reasoning_content\" in the thinking mode must be passed back to the API...",
    "type": "invalid_request_error"
  }
}

The confusing part: your code didn't change between turn one and turn two. The conversation state did.

Why it happens

In thinking mode, DeepSeek returns the model's chain-of-thought in a separate reasoning_content field, at the same level as content:

{
  "role": "assistant",
  "content": "...",
  "reasoning_content": "Let me check the file first...",
  "tool_calls": [ ... ]
}

Per DeepSeek's thinking-mode documentation, when the model performs a tool call, the intermediate assistant message's reasoning_content must participate in context concatenation — i.e., you must send it back verbatim in subsequent turns.

The problem is that reasoning_content is not part of the standard OpenAI chat schema. Most SDKs and agent frameworks rebuild message history through a converter that keeps only the fields it knows (role, content, tool_calls, tool_call_id). The reasoning field gets dropped, and DeepSeek rejects the replayed history.

This bites real tools, not just hand-rolled scripts — the same 400 has been reported in opencode, claude-code-router, and n8n's AI agent nodes, all for the same reason: the history converter strips the field.
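To make the failure mode concrete, here is a minimal sketch of the kind of whitelist converter described above. The names KNOWN_FIELDS, to_openai_message, and the read_file tool are hypothetical and not taken from any specific SDK or framework; the point is only that a converter which copies known keys silently loses reasoning_content.

```python
# Hypothetical whitelist converter: keeps only the standard OpenAI chat fields.
KNOWN_FIELDS = ("role", "content", "tool_calls", "tool_call_id")


def to_openai_message(raw: dict) -> dict:
    """Copy only the fields the converter knows about; drop everything else."""
    return {key: raw[key] for key in KNOWN_FIELDS if key in raw}


# An assistant turn shaped like a thinking-mode response with a tool call.
assistant_turn = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "Let me check the file first...",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "read_file", "arguments": "{}"},  # hypothetical tool
        }
    ],
}

replayed = to_openai_message(assistant_turn)
print(sorted(replayed))  # ['content', 'role', 'tool_calls']
```

Nothing raises and nothing is logged: the replayed message is perfectly valid OpenAI schema, which is why the problem only surfaces as a 400 on the next request.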

Fix 1: Preserve reasoning_content in your client

If you control the request-building code, keep the field on every assistant message you replay:

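# When appending the model's response to history, do NOT rebuild the
# message from scratch; carry the raw fields through.
msg = response.choices[0].message
assistant_msg = {"role": "assistant", "content": msg.content}
# reasoning_content is a DeepSeek extension, so read it defensively.
reasoning = getattr(msg, "reasoning_content", None)
if reasoning is not None:
    assistant_msg["reasoning_content"] = reasoning
# Only include tool_calls when the model made some; an empty list can be
# rejected as invalid by OpenAI-style APIs.
if msg.tool_calls:
    assistant_msg["tool_calls"] = [tc.model_dump() for tc in msg.tool_calls]
history.append(assistant_msg)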

One detail that matters:

If you use the OpenAI Python SDK, the extra field survives as long as you don't round-trip messages through strict Pydantic models that drop unknown keys.

Fix 2: Let the router handle the round-trip

If you'd rather not patch every client, put a router between your agent and DeepSeek. CodeRouter stores the raw provider response per turn and re-attaches reasoning_content when your client's replayed history is missing it, so unmodified OpenAI-compatible clients (Cursor with a custom base URL, Aider, plain SDK code) work with DeepSeek thinking models out of the box:

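from openai import OpenAI

# Point the standard OpenAI client at the router instead of DeepSeek directly.
client = OpenAI(
    base_url="https://www.coderouter.io/api/v1",
    api_key="<your coderouter key>",
)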

This is also the practical answer for tools you can't patch (closed-source IDE plugins, hosted agents).
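For intuition, the re-attachment step can be sketched as a small cache keyed by tool call id. This illustrates the idea only and is not CodeRouter's actual implementation: the class ReasoningCache, its methods remember and reattach, and the read_file tool are hypothetical names, and a real router would also need eviction and per-conversation isolation.

```python
class ReasoningCache:
    """Hypothetical cache mapping tool call ids to the reasoning that produced them."""

    def __init__(self) -> None:
        self._by_call_id: dict[str, str] = {}

    def remember(self, assistant_msg: dict) -> None:
        """Record reasoning_content from a provider response that made tool calls."""
        reasoning = assistant_msg.get("reasoning_content")
        if reasoning is None:
            return
        for call in assistant_msg.get("tool_calls") or []:
            self._by_call_id[call["id"]] = reasoning

    def reattach(self, messages: list[dict]) -> list[dict]:
        """Return a copy of the history with missing reasoning_content restored."""
        repaired = []
        for msg in messages:
            msg = dict(msg)  # shallow copy so the caller's history is untouched
            if msg.get("role") == "assistant" and "reasoning_content" not in msg:
                for call in msg.get("tool_calls") or []:
                    if call["id"] in self._by_call_id:
                        msg["reasoning_content"] = self._by_call_id[call["id"]]
                        break
            repaired.append(msg)
        return repaired


tool_call = {"id": "call_1", "type": "function",
             "function": {"name": "read_file", "arguments": "{}"}}

cache = ReasoningCache()
# Turn 1: the router sees the raw provider response and remembers the reasoning.
cache.remember({
    "role": "assistant",
    "content": "",
    "reasoning_content": "Let me check the file first...",
    "tool_calls": [tool_call],
})

# Turn 2: the client replays history with the field stripped.
stripped_history = [
    {"role": "user", "content": "What files changed?"},
    {"role": "assistant", "content": "", "tool_calls": [tool_call]},
    {"role": "tool", "tool_call_id": "call_1", "content": "src/app.py"},
]
repaired_history = cache.reattach(stripped_history)
```

Keying on the tool call id works in this sketch because the id appears both in the provider's response and in the history the client sends back, even when every non-standard field has been dropped.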

Fix 3: Disable thinking mode for tool-heavy workloads

If the reasoning tokens aren't buying you quality on your workload, turn thinking off and the constraint disappears entirely — assistant messages then carry no reasoning_content and standard OpenAI clients replay history cleanly. On the DeepSeek API this is controlled per-request (see their thinking-mode docs for the current parameter shape; older releases used separate -reasoner model names).

For coding agents specifically, a common pattern is: thinking mode for the planning step, non-thinking for mechanical multi-tool execution loops. That is exactly the phase-aware split CodeRouter automates.
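The phase split can be sketched as a small lookup. The phase names and the needs_thinking helper below are hypothetical, and the sketch deliberately stops at a boolean: how that flag maps onto an actual DeepSeek request parameter is defined by their thinking-mode docs, not by this example.

```python
# Hypothetical phase names for a coding agent; adjust to your own agent loop.
THINKING_PHASES = {"plan"}
NON_THINKING_PHASES = {"execute"}


def needs_thinking(phase: str) -> bool:
    """Return True for phases where reasoning tokens are worth paying for."""
    if phase in THINKING_PHASES:
        return True
    if phase in NON_THINKING_PHASES:
        return False
    # Fail loudly on unknown phases instead of silently picking a mode.
    raise ValueError(f"unknown phase: {phase!r}")


for phase in ("plan", "execute"):
    mode = "thinking" if needs_thinking(phase) else "non-thinking"
    print(f"{phase}: {mode}")
```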

How to verify the fix

Send a two-turn conversation where turn one forces a tool call, then check that turn two succeeds:

# Turn 2 request body must contain BOTH fields on the assistant message:
"messages": [
  {"role": "user", "content": "What files changed?"},
  {"role": "assistant", "content": "", "reasoning_content": "...", "tool_calls": [...]},
  {"role": "tool", "tool_call_id": "call_1", "content": "src/app.py"}
]

If the request logs show the assistant message without reasoning_content, your converter is still stripping it.
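That check can also run in code before the request leaves your process. The helper below is a minimal sketch with a hypothetical name, find_stripped_messages; it flags assistant messages that carry tool_calls but no reasoning_content, which is the shape the 400 is documented for. The tool_calls entries are abbreviated to an id for brevity.

```python
def find_stripped_messages(messages: list[dict]) -> list[int]:
    """Return indexes of assistant tool-call messages missing reasoning_content."""
    return [
        index
        for index, msg in enumerate(messages)
        if msg.get("role") == "assistant"
        and msg.get("tool_calls")
        and msg.get("reasoning_content") is None
    ]


good = [
    {"role": "user", "content": "What files changed?"},
    {"role": "assistant", "content": "", "reasoning_content": "...",
     "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "src/app.py"},
]
# Simulate a converter that strips the field from every message.
bad = [{k: v for k, v in msg.items() if k != "reasoning_content"} for msg in good]

print(find_stripped_messages(good))  # []
print(find_stripped_messages(bad))   # [1]
```

Running this as an assertion in a test or a debug hook catches the regression at the point where history is rebuilt, rather than one network round-trip later.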

FAQ

Does this affect non-thinking DeepSeek models?

No. Only thinking-mode responses carry reasoning_content, and only they enforce the round-trip requirement. Standard chat models replay fine with the plain OpenAI schema.

Why does it only fail after tool calls?

Without tool calls there's usually no multi-turn assistant history to replay inside one logical exchange. The requirement is documented specifically for the tool-call case: the intermediate assistant turn (the one that decided to call the tool) must keep its reasoning attached when you send the tool result back.

Do OpenRouter or other gateways fix this automatically?

Not reliably — a gateway that just proxies your request forwards whatever history your client built, so if the client stripped the field, the 400 still happens. The fix has to happen where history is rebuilt: in your client (Fix 1) or in a router that reconstructs it (Fix 2).


Sources: DeepSeek Thinking Mode docs, opencode issue #24104, claude-code-router issue #1378.
