Qwen3-4B is a 4 billion parameter dense language model from the Qwen3 series, designed to support both general-purpose and reasoning-intensive tasks. It introduces a dual-mode architecture—thinking and non-thinking—allowing dynamic switching between high-precision logical reasoning and efficient dialogue generation.
Input
Output
Context
41K
Max Output
8K
Parameters
4B
Input Modalities
Output Modalities
Features
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
Data sourced from official provider APIs and documentation
Last updated: Apr 27, 2026
From model selection to production, one platform, no fragmentation.