Alibaba has released qwen-mt-turbo, an updated version of its machine translation model, available via the Qwen API. The model is built on Qwen3 and trained on trillions of multilingual and translation tokens. The post describes reinforcement learning as a key part of the training process, crediting it with “significant improvements in translation accuracy and linguistic fluency.”

The headline numbers are coverage and cost: 92 supported languages, pricing as low as $0.50 per million output tokens, and a lightweight Mixture-of-Experts architecture designed for speed in high-concurrency environments.

Coverage and benchmark results

The 92 languages span “major official languages and prominent dialects,” which the post says covers “over 95% of the global population.” The announcement does not list all 92 languages but identifies ten that received human evaluation: Chinese, English, Japanese, Korean, Thai, Arabic, Italian, Russian, Spanish, and French.

On automatic benchmarks covering Chinese-English and English-German translation, as well as the WMT24 multilingual translation benchmark, the post reports that qwen-mt-turbo “significantly outperforms comparably-sized models including GPT-4.1-mini, Gemini-2.5-Flash, and Qwen3-8B.” The post also claims that even against “state-of-the-art large language models such as GPT-4.1, Gemini-2.5-Pro, and Qwen3-235B-A22B, Qwen-MT maintains competitive translation quality” while running on a smaller architecture.

Human evaluation used three independent professional translators per sample, with cross-validation, across the ten languages listed above. The post says qwen-mt-turbo “achieved superior performance metrics, demonstrating significant advantages in both acceptance rates and excellence rates,” but gives no specific numbers for either rate.

Customization features

Three customization mechanisms are available via API parameters: terminology intervention, domain prompts, and translation memory.

Terminology intervention lets callers inject fixed term pairs so the model applies user-specified vocabulary consistently. The post gives a worked example in which four Chinese technical terms — “生物传感器,” “石墨烯,” “化学元素,” and “身体健康状况” — are predefined and passed as a terms list in the request body. The model then maps them to “biological sensor,” “graphene,” “chemical elements,” and “health status of the body” throughout the output.
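
A sketch of what that terms list could look like in the request body, following the post's worked example; the field layout mirrors the translation_options format covered under API integration below and should be treated as an assumption:

```python
# Terminology intervention: fixed source/target term pairs the model must
# apply consistently throughout the output. Field names are assumed from
# the translation_options format; the term pairs come from the post's example.
translation_options = {
    "source_lang": "Chinese",
    "target_lang": "English",
    "terms": [
        {"source": "生物传感器", "target": "biological sensor"},
        {"source": "石墨烯", "target": "graphene"},
        {"source": "化学元素", "target": "chemical elements"},
        {"source": "身体健康状况", "target": "health status of the body"},
    ],
}
```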

Domain prompts allow callers to pass natural-language descriptions of the target style and subject area. The post’s example instructs the model to translate a Chinese SQL documentation sentence using “Ali Cloud IT domain” conventions and professional troubleshooting terminology. The resulting output preserves the technical structure (“The second SELECT statement returns a number that indicates how many rows were returned by the first SELECT statement without LIMIT clause”) without paraphrasing.
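
A sketch of how such a domain prompt might be passed, assuming a free-text domains field inside translation_options; the prompt wording here is illustrative, not quoted from the post:

```python
# Domain prompt: a natural-language hint describing target style and subject
# area. The "domains" field name and the prompt text are assumptions.
translation_options = {
    "source_lang": "Chinese",
    "target_lang": "English",
    "domains": (
        "The text is technical documentation from the Ali Cloud IT domain. "
        "Use professional troubleshooting terminology and preserve SQL syntax."
    ),
}
```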

The post notes that this style-adaptation capability is especially relevant when register must vary by context: formal language for legal documents, conversational language for social media.

API integration

The model is accessed through Qwen’s OpenAI-compatible API endpoint at dashscope-intl.aliyuncs.com. Callers use the standard openai.OpenAI client with a DASHSCOPE_API_KEY and pass translation_options in the extra_body parameter. Source language can be set to auto for detection or specified explicitly.
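
A minimal sketch of that call using the standard OpenAI Python SDK; the /compatible-mode/v1 path suffix and the sample sentence are assumptions rather than details confirmed by the post:

```python
import os

from openai import OpenAI

# Endpoint host, model name, and the translation_options shape come from the
# announcement; the path suffix and example text are assumptions.
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen-mt-turbo",
    messages=[{"role": "user", "content": "我看到这个视频后没有笑"}],
    extra_body={
        "translation_options": {
            "source_lang": "auto",   # or an explicit source language
            "target_lang": "English",
        }
    },
)
print(completion.choices[0].message.content)
```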

The basic request structure adds a translation_options dict containing source_lang and target_lang, with optional terms, domains, and translation memory fields. This means existing code using the OpenAI SDK requires minimal changes to call qwen-mt-turbo.
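
Translation memory is the one mechanism not illustrated above. A hedged sketch, assuming a tm_list field of previously approved source/target pairs; the field name follows DashScope's documented format and may differ across SDK versions, and the segment pair is hypothetical:

```python
# Translation memory: approved segment pairs the model should reuse when the
# same source text recurs. "tm_list" is assumed from DashScope's documented
# format; the pair below is a hypothetical example.
translation_options = {
    "source_lang": "Chinese",
    "target_lang": "English",
    "tm_list": [
        {
            "source": "通过如下命令可以查看集群的状态。",
            "target": "You can check the cluster status with the following command.",
        },
    ],
}
```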

Cost and deployment positioning

The $0.50-per-million-output-tokens price point is the core competitive claim. The post describes the model as “particularly well-suited for high-concurrency environments and latency-sensitive applications.” The implied comparison is to general-purpose frontier models, which carry higher per-token costs for translation workloads where raw reasoning capability is not the primary bottleneck.
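
For scale, a back-of-the-envelope calculation at that rate; this covers output tokens only, since input-token pricing is not quoted in the post:

```python
# Output-token cost at the advertised $0.50 per million output tokens.
# Input-token pricing is not covered here.
def output_cost_usd(output_tokens: int, price_per_million: float = 0.50) -> float:
    return output_tokens / 1_000_000 * price_per_million

print(output_cost_usd(200_000_000))  # 200M translated tokens -> 100.0 USD
```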

The Qwen team frames the model’s goals as “faithfulness, fluency, and elegance,” acknowledging that perfect translation “remains an ongoing journey filled with challenges.” The post states the team will “continue to enhance translation accuracy and naturalness, expand coverage to more languages,” without committing to a specific roadmap.

For teams running translation at volume (content pipelines, localization, document processing), qwen-mt-turbo offers a purpose-trained alternative to general models. The combination of reinforcement-learning-tuned quality, pricing below a dollar per million output tokens, and built-in term injection covers most of the customization needs that domain-specific translation typically requires.