Google's compact 2-billion-parameter model from the Gemma 3n series, optimized for efficient inference. It features an 8K context window and is released under the Gemma license. Built for lightweight deployments where speed and low memory usage matter more than peak capability.
Context: 8K
Max Output: 2K
Parameters: 2B
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
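The INT8 estimate can be reproduced with simple arithmetic: one byte per weight, plus some runtime overhead for activations and framework buffers. This is a rough sketch, not a provider figure; the 20% overhead fraction below is an illustrative assumption.

```python
def int8_memory_gib(num_params: float, overhead: float = 0.2) -> float:
    """Rough INT8 memory footprint in GiB.

    INT8 stores one byte per parameter; `overhead` is an assumed
    fraction added for activations, KV cache, and framework buffers.
    """
    weight_bytes = num_params * 1  # 1 byte per weight at INT8
    return weight_bytes * (1 + overhead) / 1024**3

# 2B parameters at INT8: ~1.86 GiB of weights, ~2.2 GiB with overhead
print(round(int8_memory_gib(2e9), 2))
```

Actual usage varies with context length, batch size, and the serving framework, which is why such figures are presented only as estimates.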
Data sourced from official provider APIs and documentation
Last updated: May 5, 2026