Routescope API
Chat Completions

Qwen

Qwen Chat Completions and machine translation model overview

Qwen text capabilities include chat completions and machine translation. They use the OpenAI-compatible Chat Completions path. This page separates model choices and parameter differences by capability.

Endpoint Path

Method | Path | Purpose
--- | --- | ---
POST | /v1/chat/completions | Qwen chat completion or machine translation request
curl -X POST "https://api.routescope.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Hello, please introduce yourself in one sentence."
      }
    ]
  }'
{
  "id": "task_01JZ8M9Q4R7V2K8N9P0Q",
  "object": "string",
  "created": 1,
  "model": "gpt-4o-mini",
  "choices": [],
  "usage": {
    "prompt_tokens": 1,
    "completion_tokens": 1,
    "total_tokens": 1,
    "input_tokens": 1,
    "output_tokens": 1
  }
}
{
  "error": null,
  "message": "success"
}
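
Reading a response of the shape shown above can be sketched as follows. This is a minimal sketch, assuming that non-empty `choices` entries follow the OpenAI-style `choices[i].message.content` layout (the sample above only shows an empty `choices` array). `summarize_response` is a hypothetical helper name, not part of the API.

```python
import json


def summarize_response(raw: str) -> dict:
    """Extract the first reply text (if any) and token usage from a response body."""
    body = json.loads(raw)
    choices = body.get("choices") or []
    # Assumption: OpenAI-style layout, choices[0].message.content.
    text = None
    if choices:
        text = (choices[0].get("message") or {}).get("content")
    usage = body.get("usage") or {}
    return {
        "id": body.get("id"),
        "model": body.get("model"),
        "text": text,
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the sample response body shown above.
sample = (
    '{"id": "task_01JZ8M9Q4R7V2K8N9P0Q", "model": "gpt-4o-mini", "choices": [], '
    '"usage": {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 1}}'
)
summary = summarize_response(sample)
```

Because `choices` may be empty, the helper returns `None` for the text instead of raising an index error.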

Authorization

BearerAuth

Authorization: Bearer <token>

Authenticates requests to the model relay API. Request header: Authorization: Bearer <token>.

In: header
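
Attaching the bearer token in Python can be sketched as below, using only the standard library. This is a sketch under assumptions: `build_request` is a hypothetical helper, `YOUR_TOKEN` is a placeholder, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.routescope.ai/v1/chat/completions"


def build_request(token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated Chat Completions request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The token travels in the Authorization header, as described above.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_request(
    "YOUR_TOKEN", "qwen3.5-flash", "Hello, please introduce yourself in one sentence."
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=30).
```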

Request Body

application/json

model* string

The model name to call.

messages* array

Conversation messages. At least 1 message is required.

temperature? number

Sampling temperature. Larger values produce more random output.

Range: 0 <= value <= 2

top_p? number

Nucleus sampling parameter. It is usually not adjusted heavily together with temperature.

Range: 0 <= value <= 1

max_tokens? integer

Maximum number of output tokens, from 1 up to the model's context limit.

Range: 1 <= value

stream? boolean

Whether to enable SSE streaming output. Values: true or false.

stream_options?

Streaming extension options. Upstream support varies.
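
Consuming streamed output can be sketched as follows. This page does not specify the chunk format, so the sketch assumes the OpenAI-style convention: each SSE event is a `data: {json}` line carrying `choices[i].delta.content`, and the stream ends with `data: [DONE]`. `iter_stream_text` is a hypothetical helper, and the demo lines stand in for a real network stream.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from the SSE lines of a streamed chat completion."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed OpenAI-style end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices") or []:
            fragment = (choice.get("delta") or {}).get("content")
            if fragment:
                yield fragment


demo_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
streamed_text = "".join(iter_stream_text(demo_lines))
```

Chunks with an empty `choices` array are skipped rather than treated as errors, since their contents may vary by upstream.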

enable_thinking? boolean

Whether to enable deep thinking mode. This is a Qwen / Alibaba Cloud OpenAI-compatible extension parameter: true enables thinking and false disables thinking. Some thinking-only models always have it enabled and do not support disabling it. Python OpenAI SDK users can pass it through extra_body.
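
In a raw request, the extension field sits at the top level of the JSON body, as sketched below. `with_thinking` is a hypothetical helper, and choosing `qwen3.5-plus` here is an assumption made for illustration; whether a given model honors the flag depends on the upstream.

```python
import json


def with_thinking(payload: dict, enabled: bool) -> dict:
    """Return a copy of the request body with enable_thinking set at the top level."""
    body = dict(payload)
    body["enable_thinking"] = enabled
    return body


base = {
    "model": "qwen3.5-plus",  # assumption: illustrative model choice
    "messages": [{"role": "user", "content": "Compare quicksort and mergesort."}],
}
request_body = with_thinking(base, True)
print(json.dumps(request_body, indent=2))

# With the Python OpenAI SDK, the same field would go through extra_body, e.g.
# client.chat.completions.create(..., extra_body={"enable_thinking": True})
```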

tools? array

Tool definitions that the model can call. The array length and schema complexity are subject to upstream limits.

tool_choice? string | object

Tool call policy. Allowed values: auto, none, required, or an object that explicitly specifies a tool.
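
A request that combines `tools` and `tool_choice` might look like the sketch below. It assumes the OpenAI-style function tool layout (`type: "function"` with a nested `function` object), which this page does not spell out. `get_weather` and `build_tool_request` are hypothetical names used only for illustration.

```python
# Hypothetical tool: get_weather is illustrative and not part of the API.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

ALLOWED_TOOL_CHOICES = {"auto", "none", "required"}


def build_tool_request(model, messages, tools, tool_choice="auto"):
    """Assemble a request body, checking string tool_choice values against the documented set."""
    if isinstance(tool_choice, str) and tool_choice not in ALLOWED_TOOL_CHOICES:
        raise ValueError(f"unsupported tool_choice: {tool_choice!r}")
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "tool_choice": tool_choice,
    }


tool_request = build_tool_request(
    "qwen3.5-plus",
    [{"role": "user", "content": "What is the weather in Hangzhou?"}],
    [weather_tool],
    # Explicit tool object (assumed OpenAI-style shape) instead of a string policy.
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
```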

response_format? object

Structured output constraints, such as JSON Schema.
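
A JSON Schema constraint could be expressed as in the sketch below. The `json_schema` wrapper follows the OpenAI convention and is an assumption here, since this page does not show the object's shape and upstream support may differ. `city_info` and `parse_structured_reply` are hypothetical names, and the reply string is made-up demo data.

```python
import json

# Assumed OpenAI-style json_schema wrapper; "city_info" is a hypothetical schema name.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
        },
    },
}


def parse_structured_reply(content: str, required_keys: list) -> dict:
    """Parse the model's JSON reply and check that the required keys are present."""
    data = json.loads(content)
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


required = response_format["json_schema"]["schema"]["required"]
parsed = parse_structured_reply('{"city": "Example City", "population": 1000}', required)
```

Checking required keys on the client side is a cheap safeguard in case an upstream model ignores the constraint.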

user? string

End-user identifier, used for auditing and control.
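
The constraints listed for the request body (at least 1 message, 0 to 2 for temperature, 0 to 1 for top_p, at least 1 for max_tokens) can be checked on the client before sending. The following is a minimal sketch; `validate_request` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def validate_request(body: dict) -> list:
    """Return a list of problems found in a request body; an empty list means it passed."""
    problems = []
    if not isinstance(body.get("model"), str) or not body["model"]:
        problems.append("model is required and must be a non-empty string")
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        problems.append("messages must contain at least 1 message")
    temperature = body.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must satisfy 0 <= value <= 2")
    top_p = body.get("top_p")
    if top_p is not None and not 0 <= top_p <= 1:
        problems.append("top_p must satisfy 0 <= value <= 1")
    max_tokens = body.get("max_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens < 1):
        problems.append("max_tokens must be an integer >= 1")
    stream = body.get("stream")
    if stream is not None and not isinstance(stream, bool):
        problems.append("stream must be true or false")
    return problems


ok_body = {
    "model": "qwen3.5-flash",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 0.7,
}
bad_body = {
    "model": "qwen3.5-flash",
    "messages": [],
    "temperature": 3,
    "top_p": 1.5,
    "max_tokens": 0,
}
ok_problems = validate_request(ok_body)
bad_problems = validate_request(bad_body)
```

The upper bound for max_tokens depends on the model's context limit, so the sketch only checks the lower bound.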

Response Body

application/json


Model Selection

Model ID | Capability | Typical Use
--- | --- | ---
qwen2.5-14b-instruct | Chat completions | General instruction following and conversation
qwen3.5-397b-a17b | Chat completions | More complex text tasks
qwen3.5-flash | Chat completions | Lightweight or low-latency conversation
qwen3.5-plus | Chat completions | General enhanced conversation
qwen3-max-preview | Chat completions | Preview model
qwen-mt-flash | Machine translation | Low-latency translation
qwen-mt-plus | Machine translation | General translation
qwen-mt-turbo | Machine translation | High-throughput translation

Common Parameters

Field | Type | Required | Description
--- | --- | --- | ---
model | string | Yes | Qwen model ID.
messages | array | Yes | OpenAI-style conversation messages. Translation tasks can place source text and requirements in messages.
temperature | number | No | Sampling temperature.
top_p | number | No | Nucleus sampling parameter.
max_tokens | integer | No | Maximum output tokens.
stream | boolean | No | Whether to stream output.
tools | array | No | Tool definitions.
response_format | object | No | Structured output constraint.
user | string | No | End-user identifier.

Model-Specific Notes

Field | Applicable Models | Description
--- | --- | ---
model | All Qwen models | Pass the concrete model ID in the request body.
enable_thinking | Qwen / Alibaba Cloud OpenAI-compatible extension | The OpenAPI schema notes that true enables thinking and false disables thinking. Some thinking-only models always enable it and do not support disabling. Python OpenAI SDK users can pass it through extra_body.
Translation prompt structure | qwen-mt-flash, qwen-mt-plus, qwen-mt-turbo | These models are intended for machine translation, but the endpoint remains Chat Completions-compatible. Express translation requirements through messages.
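
A translation request that carries its requirements in `messages` might be assembled as in the sketch below. The instruction wording is an assumption made for illustration, not a documented prompt format, and `build_translation_request` is a hypothetical helper.

```python
def build_translation_request(text, source_lang, target_lang, model="qwen-mt-plus"):
    """Build a Chat Completions body that carries a translation task in messages."""
    # Assumption: a plain-language instruction followed by the source text.
    instruction = (
        f"Translate the following text from {source_lang} to {target_lang}. "
        "Return only the translation."
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"{instruction}\n\n{text}"}],
    }


translation_body = build_translation_request("你好，世界", "Chinese", "English")
```

Swapping the `model` argument for `qwen-mt-flash` or `qwen-mt-turbo` keeps the same body shape, since all three use the same endpoint.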
