Qwen

Qwen 3.5 397B A17B FP8

Name: Qwen 3.5 397B A17B FP8
Author: Qwen

Qwen develops the Qwen 3.5 397B A17B model, a chat model suited to code generation, reasoning, and text generation, with particular strength in multimodal understanding from its unified vision-language foundation. Its hybrid architecture combines Gated Delta Networks with a sparse Mixture-of-Experts design, aimed at high-throughput, low-latency inference.

Context: 262K

Max Output: 66K

Parameters: 397B

Technical Specifications

Model Type: Chat

Context Window: 262,144 tokens

Max Output Tokens: 66,000 tokens

Parameters: 397B

Release Date: Feb 18, 2026

Training Cutoff: Nov 1, 2025

License: Apache-2.0

Open Source: Yes
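The context window and output cap listed above can be combined to work out how much room is left for the prompt. The sketch below assumes that prompt and output tokens share the same 262,144-token window, which is common for chat models but is not stated in this listing; the function name `max_prompt_tokens` is hypothetical.

```python
CONTEXT_WINDOW = 262_144  # tokens, from the specifications above
MAX_OUTPUT = 66_000       # tokens, from the specifications above


def max_prompt_tokens(requested_output: int,
                      context_window: int = CONTEXT_WINDOW,
                      max_output: int = MAX_OUTPUT) -> int:
    """Return how many prompt tokens fit alongside the requested output.

    Assumption: prompt and output share one context window.
    """
    if requested_output < 0:
        raise ValueError("requested_output must be non-negative")
    if requested_output > max_output:
        raise ValueError(
            f"requested_output exceeds the {max_output}-token output cap"
        )
    return context_window - requested_output


if __name__ == "__main__":
    print(max_prompt_tokens(66_000))  # 196144
    print(max_prompt_tokens(4_096))   # 258048
```

Under that assumption, reserving the full 66,000-token output leaves 196,144 tokens for the prompt, including any image tokens.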

Input Modalities: Text, Image

Output Modalities: Text


Benchmarks

Artificial Analysis

HLE: 27.3%

GPQA: 89.3%

SciCode: 42.0%

Coding: 41.3

Intelligence: 33.7

Speed (tok/s): 50.5

TTFA (s): 64.8

TTFT (s): 1.7
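As a rough illustration of what the listed speed (50.5 tok/s) and TTFT (1.7 s) imply for a single response, the sketch below adds the time to first token to the streaming time for the remaining output. It assumes TTFT means time to first token and that output speed is constant, and it ignores any reasoning phase before the answer, which the separate TTFA figure may reflect. The function name is hypothetical.

```python
TTFT_S = 1.7        # seconds, as listed above
SPEED_TOK_S = 50.5  # output tokens per second, as listed above


def estimated_response_seconds(output_tokens: int,
                               ttft_s: float = TTFT_S,
                               speed_tok_s: float = SPEED_TOK_S) -> float:
    """Estimate wall-clock time for one response.

    Assumption: latency = time to first token + tokens / constant speed.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if speed_tok_s <= 0:
        raise ValueError("speed_tok_s must be positive")
    return ttft_s + output_tokens / speed_tok_s


if __name__ == "__main__":
    # A 1,000-token answer under these assumptions:
    print(round(estimated_response_seconds(1_000), 1))  # 21.5
```

Measured latency depends on the provider, load, and prompt length, so treat the result as an order-of-magnitude estimate.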

Resources & Links

HuggingFace

Model card on HuggingFace

Estimated GPU Requirements for Qwen 3.5 397B A17B FP8

Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.

2× AMD Instinct MI325X: 512 GB VRAM, 77% used

2× AMD Instinct MI355X: 576 GB VRAM, 69% used

2× AMD Instinct MI350X: 576 GB VRAM, 69% used
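The utilization figures above are roughly what a weights-only estimate gives at one byte per parameter (8-bit weights). The sketch below reproduces that arithmetic; it assumes 1 GB = 10^9 bytes and ignores KV cache, activations, and framework overhead. The site's exact formula is not stated, so the function name and the rounding are assumptions.

```python
PARAMS_BILLION = 397  # total parameters, from the listing
BYTES_PER_PARAM = 1   # assumption: 8-bit weights (INT8 or FP8)


def weight_utilization(total_vram_gb: float,
                       params_billion: float = PARAMS_BILLION,
                       bytes_per_param: float = BYTES_PER_PARAM) -> float:
    """Fraction of VRAM taken by model weights alone."""
    if total_vram_gb <= 0:
        raise ValueError("total_vram_gb must be positive")
    # 1e9 params * bytes per param / 1e9 bytes per GB = params_billion * bytes
    weights_gb = params_billion * bytes_per_param
    return weights_gb / total_vram_gb


if __name__ == "__main__":
    configs = {
        "2x AMD Instinct MI325X": 512,
        "2x AMD Instinct MI355X": 576,
        "2x AMD Instinct MI350X": 576,
    }
    for name, vram_gb in configs.items():
        print(f"{name}: {weight_utilization(vram_gb):.1%}")
```

This yields about 77.5% for 512 GB and 68.9% for 576 GB, close to the listed 77% and 69%. The remaining headroom has to cover the KV cache, which grows with context length and batch size.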

Data sourced from official provider APIs and documentation

Last updated: Jun 24, 2026
