Qwen

Qwen 3 Omni 30B A3B Instruct

Name: Qwen 3 Omni 30B A3B Instruct
Author: Qwen

Qwen develops Qwen 3 Omni 30B A3B Instruct, a natively end-to-end, multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses. It reports state-of-the-art results across modalities, particularly on audio and audio-video benchmarks. A notable technical trait is its MoE-based Thinker–Talker design with AuT pretraining.


Technical Specifications

Model Type: Chat
Context Window: 66,000 tokens
Max Output Tokens: 66,000 tokens
Parameters: 35.3B
Release Date: Sep 20, 2025
Training Cutoff: Not available
License: Other
Open Source: Yes

Input Modalities: Text, Image, Audio
Output Modalities: Text, Image, Audio


Benchmarks

Artificial Analysis

HLE: 5.1%
GPQA: 62.0%
SciCode: 18.6%
MMLU Pro: 72.5%
Math: 52.3
Coding: 7.2
LiveCodeBench: 42.2%
Intelligence: 5.1
Speed (tok/s): 105
TTFA (s): 0.879
TTFT (s): 0.879
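The speed and TTFT figures listed above can be combined into a rough end-to-end latency estimate for a text response. The sketch below assumes a simple model that is not stated on this page: the client waits for the first token, then tokens stream at a constant rate. The function name and the sample token counts are illustrative only.

```python
# Figures copied from the benchmark list on this page.
SPEED_TOK_PER_S = 105.0  # output speed in tokens per second
TTFT_S = 0.879           # time to first token in seconds


def estimated_response_seconds(output_tokens: int,
                               speed: float = SPEED_TOK_PER_S,
                               ttft: float = TTFT_S) -> float:
    """Assumed model: wait for the first token, then stream at a constant rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / speed


if __name__ == "__main__":
    # Illustrative response lengths, not measurements.
    for n in (100, 500, 2000):
        print(f"{n:>5} tokens: ~{estimated_response_seconds(n):.2f} s")
```

Real latency also depends on prompt length, batching, and provider load, so treat the result as an order-of-magnitude guide.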

Resources & Links

HuggingFace

Model card on HuggingFace

Estimated GPU Requirements for Qwen 3 Omni 30B A3B Instruct

Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.

NVIDIA A100 SXM 40GB: 40 GB VRAM, 100% used
NVIDIA L40: 48 GB VRAM, 83% used
NVIDIA L40S: 48 GB VRAM, 83% used
AMD Instinct MI210: 64 GB VRAM, 62% used
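The utilization figures above can be approximated with simple arithmetic: INT8 stores about one byte per parameter, so 35.3B parameters need roughly 35.3 GB for weights alone. The sketch below adds a hypothetical 13% overhead for activations and runtime buffers. That factor is an assumption chosen so the result lands near the listed percentages; it is not the catalog's documented method.

```python
PARAMS_BILLION = 35.3       # parameter count listed on this page
BYTES_PER_PARAM_INT8 = 1.0  # INT8 stores one byte per parameter
OVERHEAD = 0.13             # ASSUMPTION: extra memory beyond weights

# VRAM sizes (GB) of the GPUs listed on this page.
GPUS_GB = {
    "NVIDIA A100 SXM 40GB": 40,
    "NVIDIA L40": 48,
    "NVIDIA L40S": 48,
    "AMD Instinct MI210": 64,
}


def required_gb(params_billion: float, bytes_per_param: float,
                overhead: float) -> float:
    """Weights in GB (1 GB = 1e9 bytes) plus a fractional overhead."""
    return params_billion * bytes_per_param * (1.0 + overhead)


def utilization_percent(needed_gb: float, vram_gb: float) -> float:
    """Share of VRAM consumed, capped at 100%."""
    return min(100.0, 100.0 * needed_gb / vram_gb)


if __name__ == "__main__":
    needed = required_gb(PARAMS_BILLION, BYTES_PER_PARAM_INT8, OVERHEAD)
    for name, vram in GPUS_GB.items():
        print(f"{name}: {utilization_percent(needed, vram):.0f}% of {vram} GB")
```

Changing `BYTES_PER_PARAM_INT8` to 2.0 gives a comparable estimate for 16-bit weights under the same assumed overhead.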


Data sourced from official provider APIs and documentation

Last updated: Jun 23, 2026

