Qwen

Qwen text capabilities include chat completions and machine translation. They use the OpenAI-compatible Chat Completions path. This page separates model choices and parameter differences by capability.

Endpoint Path

Method	Path	Purpose
POST	`/v1/chat/completions`	Qwen chat completion or machine translation request

curl -X POST "https://api.routescope.ai/v1/chat/completions" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5-flash",
    "messages": [
      {
        "role": "user",
        "content": "Hello, please introduce yourself in one sentence."
      }
    ]
  }'

{
  "id": "task_01JZ8M9Q4R7V2K8N9P0Q",
  "object": "string",
  "created": 1,
  "model": "qwen3.5-flash",
  "choices": [],
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1,
    "input_tokens": 1,
    "output_tokens": 1
  }
}

{
  "error": null,
  "message": "success"
}
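
The success example above reports token counts under both OpenAI-style names (`prompt_tokens`, `completion_tokens`) and `input_tokens` / `output_tokens`. A minimal sketch of reading a response defensively follows. The helper names are hypothetical, and the `choices[0].message.content` path is an assumption based on the OpenAI-compatible shape, since the example on this page shows an empty `choices` array.

```python
def summarize_usage(response: dict) -> dict:
    """Read token counts, accepting either naming style shown in the example."""
    usage = response.get("usage") or {}
    prompt = usage.get("prompt_tokens", usage.get("input_tokens", 0))
    completion = usage.get("completion_tokens", usage.get("output_tokens", 0))
    total = usage.get("total_tokens", prompt + completion)
    return {"prompt": prompt, "completion": completion, "total": total}


def first_message_text(response: dict):
    """Return the first choice's text, or None when choices is empty.

    Assumes the OpenAI-compatible layout choices[0].message.content.
    """
    choices = response.get("choices") or []
    if not choices:
        return None
    return (choices[0].get("message") or {}).get("content")


example = {
    "id": "task_01JZ8M9Q4R7V2K8N9P0Q",
    "choices": [],
    "usage": {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 1},
}
usage_summary = summarize_usage(example)
text = first_message_text(example)
```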

Authorization

BearerAuth

Authorization: Bearer <token>

Authenticates requests to the model relay API. Request header: Authorization: Bearer <token>.

In: header
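
As a rough sketch of how the bearer header could be attached from Python, the snippet below builds the request with the standard library only and does not send it. The `build_request` helper and the `YOUR_TOKEN` placeholder are illustrative assumptions; the URL and header format are taken from this page.

```python
import json
import urllib.request

API_URL = "https://api.routescope.ai/v1/chat/completions"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_request(
    "YOUR_TOKEN",  # placeholder, replace with a real token
    {
        "model": "qwen3.5-flash",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
# To actually send it: urllib.request.urlopen(request)
```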

Request Body

application/json

model* string

The model name to call.

messages* array

OpenAI-style conversation messages. Must contain at least 1 message.

temperature? number

Sampling temperature. Range: 0 to 2; the larger the value, the more random the output.

Range: 0 <= value <= 2

top_p? number

Nucleus sampling parameter. Range: 0 to 1; it is usually not adjusted heavily together with temperature.

Range: 0 <= value <= 1

max_tokens? integer

Maximum number of output tokens. Range: 1 up to the model's context limit.

Range: 1 <= value
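
The ranges documented so far can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical `validate_params` helper; it covers only the limits stated on this page and does not know the per-model upper bound for `max_tokens`.

```python
def validate_params(payload: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    errors = []

    if not payload.get("model"):
        errors.append("model is required")

    messages = payload.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages must contain at least 1 message")

    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        errors.append("temperature must satisfy 0 <= value <= 2")

    top_p = payload.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        errors.append("top_p must satisfy 0 <= value <= 1")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens < 1):
        errors.append("max_tokens must be an integer >= 1")

    return errors


good = validate_params({
    "model": "qwen3.5-flash",
    "messages": [{"role": "user", "content": "hi"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256,
})
bad = validate_params({
    "model": "qwen3.5-flash",
    "messages": [],
    "temperature": 2.5,
    "max_tokens": 0,
})
```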

stream? boolean

Whether to enable SSE streaming output. Values: true or false.

stream_options? object

Streaming extension options. Upstream support varies.
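
When `stream` is true, the reply arrives as server-sent events. The sketch below shows one way such a stream could be parsed offline. It assumes the common OpenAI-compatible framing (`data: {...}` lines ending with `data: [DONE]`) and the `choices[].delta.content` chunk shape, neither of which is spelled out on this page; the function names and sample lines are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed JSON chunks from SSE lines until the [DONE] marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the text deltas found in the chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


# Hand-written sample lines standing in for a real response stream.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(iter_stream_chunks(sample))
```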

enable_thinking? boolean

Whether to enable deep thinking mode. Qwen / Alibaba Cloud OpenAI-compatible extension parameter: `true` enables thinking and `false` disables it. Some thinking-only models always keep it enabled and do not support disabling it. Python OpenAI SDK users can pass it through `extra_body`.
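
To illustrate where the flag sits, here is a small sketch that adds `enable_thinking` as a top-level field of the JSON body. The `with_thinking` helper is hypothetical, and whether a given model honors the flag depends on the upstream.

```python
import json


def with_thinking(payload: dict, enabled: bool) -> dict:
    """Return a copy of the payload with enable_thinking set."""
    updated = dict(payload)
    updated["enable_thinking"] = enabled
    return updated


base = {
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Explain nucleus sampling briefly."}],
}
body = with_thinking(base, False)
print(json.dumps(body, indent=2))

# With the Python OpenAI SDK, the same field would be passed through extra_body:
#   client.chat.completions.create(
#       model=base["model"],
#       messages=base["messages"],
#       extra_body={"enable_thinking": False},
#   )
```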

tools? array

Tool definitions that the model can call. Scope: array length and schema complexity are subject to upstream limits.

tool_choice? string | object

Tool call policy. Values: `auto`, `none`, `required`, or an explicit tool object that names a specific function.
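
A request that uses `tools` and `tool_choice` might look like the sketch below. It assumes the OpenAI-style function tool layout; the `get_weather` function, its schema, and the `build_tool_request` helper are hypothetical and exist only for illustration.

```python
# Hypothetical tool definition in the OpenAI-style function format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

ALLOWED_POLICIES = {"auto", "none", "required"}


def build_tool_request(model: str, question: str, tool_choice="auto") -> dict:
    """Build a request body; tool_choice is a policy string or an explicit tool object."""
    if isinstance(tool_choice, str) and tool_choice not in ALLOWED_POLICIES:
        raise ValueError(f"unsupported tool_choice: {tool_choice!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
        "tool_choice": tool_choice,
    }


auto_request = build_tool_request("qwen3.5-plus", "What is the weather in Hangzhou?")
forced_request = build_tool_request(
    "qwen3.5-plus",
    "What is the weather in Hangzhou?",
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
```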

response_format? object

Structured output constraints, such as JSON Schema.
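
As one possible shape for a structured-output request, the sketch below uses the OpenAI-style `json_schema` form of `response_format` and then checks a reply locally. Whether a particular Qwen model accepts this exact form is an assumption; the schema, the `city_info` name, and the `parse_structured_reply` helper are illustrative.

```python
import json

# Assumed OpenAI-style structured output constraint.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["city", "country"],
        },
    },
}

request_body = {
    "model": "qwen3.5-plus",
    "messages": [{"role": "user", "content": "Where is West Lake? Reply as JSON."}],
    "response_format": response_format,
}


def parse_structured_reply(text: str, required_keys: list) -> dict:
    """Parse the model's reply as JSON and confirm the required keys are present."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Hand-written reply standing in for real model output.
reply = parse_structured_reply('{"city": "Hangzhou", "country": "China"}', ["city", "country"])
```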

user? string

End-user identification for audit and control.

Response Body

200 application/json

400 application/json

Model Selection

Model ID	Capability	Typical Use
`qwen2.5-14b-instruct`	Chat completions	General instruction following and conversation
`qwen3.5-397b-a17b`	Chat completions	More complex text tasks
`qwen3.5-flash`	Chat completions	Lightweight or low-latency conversation
`qwen3.5-plus`	Chat completions	General enhanced conversation
`qwen3-max-preview`	Chat completions	Preview model
`qwen-mt-flash`	Machine translation	Low-latency translation
`qwen-mt-plus`	Machine translation	General translation
`qwen-mt-turbo`	Machine translation	High-throughput translation
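
For client code that needs to choose a model by capability, the table above can be mirrored as a lookup. This is only a sketch: the `MODELS` mapping and helper names are hypothetical, and the entries are copied from the table rather than fetched from the service.

```python
# Model ID -> (capability, typical use), copied from the table above.
MODELS = {
    "qwen2.5-14b-instruct": ("Chat completions", "General instruction following and conversation"),
    "qwen3.5-397b-a17b": ("Chat completions", "More complex text tasks"),
    "qwen3.5-flash": ("Chat completions", "Lightweight or low-latency conversation"),
    "qwen3.5-plus": ("Chat completions", "General enhanced conversation"),
    "qwen3-max-preview": ("Chat completions", "Preview model"),
    "qwen-mt-flash": ("Machine translation", "Low-latency translation"),
    "qwen-mt-plus": ("Machine translation", "General translation"),
    "qwen-mt-turbo": ("Machine translation", "High-throughput translation"),
}


def models_for(capability: str) -> list[str]:
    """List model IDs with the given capability, in table order."""
    return [model for model, (cap, _) in MODELS.items() if cap == capability]


def is_translation_model(model_id: str) -> bool:
    """True when the model ID is listed as a machine translation model."""
    return MODELS.get(model_id, ("", ""))[0] == "Machine translation"


translation_models = models_for("Machine translation")
```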

Common Parameters

Field	Type	Required	Description
`model`	string	Yes	Qwen model ID.
`messages`	array	Yes	OpenAI-style conversation messages. Translation tasks can place source text and requirements in messages.
`temperature`	number	No	Sampling temperature.
`top_p`	number	No	Nucleus sampling parameter.
`max_tokens`	integer	No	Maximum output tokens.
`stream`	boolean	No	Whether to stream output.
`tools`	array	No	Tool definitions.
`response_format`	object	No	Structured output constraint.
`user`	string	No	End-user identifier.

Model-Specific Notes

Field	Applicable Models	Description
`model`	All Qwen models	Pass the concrete model ID in the request body.
`enable_thinking`	Qwen / Alibaba Cloud OpenAI-compatible extension	Per the OpenAPI schema, `true` enables thinking and `false` disables it. Some thinking-only models always keep it enabled and do not support disabling it. Python OpenAI SDK users can pass it through `extra_body`.
Translation prompt structure	`qwen-mt-flash`, `qwen-mt-plus`, `qwen-mt-turbo`	These models are intended for machine translation, but the endpoint remains Chat Completions-compatible. Express translation requirements through messages.
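
Since translation requirements travel inside `messages`, a translation request could be assembled as in the sketch below. The prompt wording and the `build_translation_request` helper are assumptions; this page only states that the source text and requirements belong in the messages.

```python
TRANSLATION_MODELS = {"qwen-mt-flash", "qwen-mt-plus", "qwen-mt-turbo"}


def build_translation_request(
    text: str,
    target_language: str,
    source_language: str = "",
    model: str = "qwen-mt-plus",
) -> dict:
    """Build a Chat Completions body that asks for a translation."""
    if model not in TRANSLATION_MODELS:
        raise ValueError(f"{model!r} is not a listed translation model")
    if source_language:
        instruction = f"Translate the following text from {source_language} into {target_language}."
    else:
        instruction = f"Translate the following text into {target_language}."
    return {
        "model": model,
        "messages": [
            # Instruction and source text share one user message (illustrative layout).
            {"role": "user", "content": f"{instruction}\n\n{text}"},
        ],
    }


translation_body = build_translation_request(
    "The weather is nice today.", "French", source_language="English"
)
```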
