DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1 that taps more compute and smarter post-training tricks, pushing its reasoning and inference to the brink of flagship models like O3 and Gemini 2.5 Pro. It now tops math, programming, and logic leaderboards, showcasing a step-change in depth-of-thought. The distilled variant, DeepSeek-R1-0528-Qwen3-8B, transfers this chain-of-thought into an 8 B-parameter form, beating standard Qwen3 8B by +10 pp and tying the 235 B “thinking” giant on AIME 2024.
Input
Output
Context
128K
Max Output
32K
Parameters
8B
Input Modalities
Output Modalities
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
Data sourced from official provider APIs and documentation
Last updated: Mar 16, 2026
Explore models, compare pricing and benchmarks, and right-size your infrastructure — all in one place.