Qwen

qwen3-235b-a22b-thinking-2507

Name: qwen3-235b-a22b-thinking-2507
Author: Qwen

Qwen's Qwen3 235B A22B Thinking 2507 FP8 is a chat model that supports streaming, function calling, extended thinking, and structured output, with a context window of 262,144 tokens. It scores in the top 25% or better on Math Index, MMLU-Pro, LiveCodeBench, AIME, and MATH-500, which points to particular strength in mathematical reasoning alongside strong general-knowledge and coding results. The large context window lets it process and respond to lengthy inputs.
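To make the context figure concrete, here is a minimal sketch of a token-budget check. It assumes that the prompt and everything the model generates (thinking tokens plus the final answer) share the single 262,144-token window; the page does not state this, and the function names are hypothetical helpers, not part of any provider API.

```python
CONTEXT_WINDOW = 262_144  # tokens, taken from the description above


def remaining_output_budget(prompt_tokens: int,
                            context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for thinking plus the answer after the prompt is counted.

    Assumption: prompt and generated tokens share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


def fits(prompt_tokens: int, reserved_output_tokens: int,
         context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + reserved_output_tokens <= context_window


if __name__ == "__main__":
    print(remaining_output_budget(200_000))  # 62144
    print(fits(200_000, 70_000))             # False
```

Because a thinking model spends part of its output on reasoning, it is usually worth reserving more output tokens than the visible answer alone would need.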


Technical Specifications

Model Type: Chat
Context Window: 262,144 tokens
Max Output Tokens: Not available
Parameters: 235B
Release Date: Jul 25, 2025
Training Cutoff: Not available
License: apache-2.0
Open Source: Yes
Input Modalities: Text
Output Modalities: Text

Capabilities: streaming, function calling, extended thinking, structured output

Benchmarks

Artificial Analysis

HLE: 15.0%
AIME: 94.0%
GPQA: 79.0%
SciCode: 42.4%
MATH 500: 98.4%
MMLU Pro: 84.3%
Math: 91.0
Coding: 23.2
LiveCodeBench: 78.8%
Intelligence: 22.3
Speed (tok/s): 53.7
TTFA (s): 38.6
TTFT (s): 1.4
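The speed and latency figures can be combined into a rough response-time estimate. The sketch below rests on two assumptions that the page does not confirm: that TTFA means time to the first answer token (so it already includes the thinking phase), and that thinking tokens stream at the same rate as answer tokens. The function names are hypothetical.

```python
SPEED_TOK_PER_S = 53.7  # Speed (tok/s)
TTFA_S = 38.6           # TTFA (s), assumed: time to first answer token
TTFT_S = 1.4            # TTFT (s), time to first token


def estimated_response_seconds(answer_tokens: int,
                               ttfa_s: float = TTFA_S,
                               speed: float = SPEED_TOK_PER_S) -> float:
    """Rough wall-clock time until the full answer has streamed."""
    return ttfa_s + answer_tokens / speed


def implied_thinking_tokens(ttfa_s: float = TTFA_S,
                            ttft_s: float = TTFT_S,
                            speed: float = SPEED_TOK_PER_S) -> float:
    """Tokens generated between the first token and the first answer token,
    assuming a constant generation speed."""
    return (ttfa_s - ttft_s) * speed


if __name__ == "__main__":
    print(round(estimated_response_seconds(537), 1))  # 48.6
    print(round(implied_thinking_tokens()))           # 1998
```

Under these assumptions, most of the wait for a short answer is the thinking phase rather than the answer itself.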

Resources & Links

HuggingFace

Model card on HuggingFace

Estimated GPU Requirements for qwen3-235b-a22b-thinking-2507

Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.

AMD Instinct MI325X: 256 GB VRAM, 93% used
AMD Instinct MI355X: 288 GB VRAM, 83% used
AMD Instinct MI350X: 288 GB VRAM, 83% used
2× AMD Instinct MI250X: 256 GB VRAM, 93% used
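As a rough cross-check of these estimates, the sketch below computes a weights-only footprint: parameter count times bytes per parameter, with INT8 taken as one byte per parameter. This is an assumed simplification, not the catalog's actual formula, and it ignores KV cache, activations, and framework overhead. The function names are hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int = 8) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width.
    """
    return params_billion * bits_per_param / 8


def utilization(params_billion: float, vram_gb: float,
                bits_per_param: int = 8) -> float:
    """Fraction of VRAM taken by the weights alone."""
    return weight_memory_gb(params_billion, bits_per_param) / vram_gb


if __name__ == "__main__":
    print(weight_memory_gb(235))             # 235.0
    print(round(utilization(235, 256), 3))   # 0.918
    print(round(utilization(235, 288), 3))   # 0.816
```

The weights-only results (about 92% and 82%) sit slightly below the listed 93% and 83%, which would be consistent with the listed figures including a small allowance for overhead that this sketch does not model.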

Use GPU Sizing Calculator for custom configurations


Data sourced from official provider APIs and documentation

Last updated: Jun 24, 2026

