Qwen3 235B 2507

Mixture-of-experts LLM with math and reasoning capabilities

Model details

Example usage

Baseten offers Dedicated Deployments and Model APIs for Qwen3 235B A22B Instruct 2507, powered by the Baseten Inference Stack.

Qwen3 235B A22B Instruct 2507 has shown strong performance on math and reasoning tasks, but serving a 235B-parameter mixture-of-experts model in production requires a highly optimized inference stack to avoid excessive latency.

Deployments of Qwen3 235B A22B Instruct 2507 are OpenAI-compatible, so you can call them with the standard OpenAI client, as shown below.

Input

from openai import OpenAI
import os

model_url = "" # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=100,
)
print(response_chat)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
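
If streaming is enabled for your deployment, the same OpenAI-compatible endpoint can return tokens as they are generated, which reduces time to first token for latency-sensitive applications. The example below is a minimal sketch, not Baseten-specific documentation: it reuses the model_url and BASETEN_API_KEY placeholders from the example above and relies on the standard OpenAI Python client's streaming interface.

Streaming input

from openai import OpenAI
import os

model_url = "" # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ['BASETEN_API_KEY'],
    base_url=model_url
)

# Streaming chat completion: print tokens as they arrive
stream = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=100,
    stream=True,
)

for chunk in stream:
    # Some chunks (for example, the final one) carry no content delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()

Streamed responses follow the OpenAI chunk format, so each chunk exposes an incremental delta rather than a complete message.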

Large language models

Kimi K2.5
DeepSeek V3.2

Qwen models

Qwen3 Coder 480B
Qwen 3 32B
