
Qwen3 Coder 30B

Small mixture-of-experts LLM with advanced coding and reasoning capabilities optimized for fast inference


Example usage

Run Qwen3 Coder 30B (Flash) on an H100 GPU.

Qwen3 has shown strong performance on math and reasoning tasks, but running it in production requires a highly optimized inference stack to avoid excessive latency.

Deployments of Qwen3 are OpenAI-compatible, so the standard OpenAI Python client can call them once its base URL points at the deployment.

Input
from openai import OpenAI
import os

model_url = ""  # Copy in from API pane in Baseten model dashboard

client = OpenAI(
    api_key=os.environ["BASETEN_API_KEY"],
    base_url=model_url,
)

# Chat completion
response_chat = client.chat.completions.create(
    model="",
    messages=[
        {"role": "user", "content": "Write FizzBuzz."}
    ],
    temperature=0.6,
    max_tokens=100,
)
print(response_chat)
JSON output
{
    "id": "143",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "[Model output here]",
                "role": "assistant",
                "audio": null,
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1741224586,
    "model": "",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 145,
        "prompt_tokens": 38,
        "total_tokens": 183,
        "completion_tokens_details": null,
        "prompt_tokens_details": null
    }
}
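
In practice you usually want the reply text and token counts, not the whole response object. The sketch below is one possible way to pull those fields out of a response shaped like the JSON above. It parses a trimmed copy of that sample with the standard library, so it runs without a deployment. The names `SAMPLE_RESPONSE` and `summarize_response` are hypothetical helpers for illustration and are not part of any SDK.

```python
import json

# Trimmed copy of the sample response shown above; values are placeholders.
SAMPLE_RESPONSE = """
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "[Model output here]", "role": "assistant"}
        }
    ],
    "usage": {"completion_tokens": 145, "prompt_tokens": 38, "total_tokens": 183}
}
"""


def summarize_response(raw: str) -> dict:
    """Extract the reply text, truncation status, and token counts.

    Hypothetical helper: assumes the OpenAI-style chat completion
    shape shown in the sample output.
    """
    data = json.loads(raw)
    choice = data["choices"][0]
    usage = data.get("usage") or {}
    return {
        "text": choice["message"]["content"],
        # A finish_reason of "length" means generation hit max_tokens.
        "truncated": choice["finish_reason"] == "length",
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_response(SAMPLE_RESPONSE)
print(summary["text"])
print(summary["truncated"], summary["total_tokens"])
```

When working with the client object directly instead of raw JSON, the same fields are available as attributes, for example `response_chat.choices[0].message.content` and `response_chat.usage.total_tokens`. Checking `finish_reason` is worthwhile when `max_tokens` is small, since a truncated code answer may not be runnable.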
