Developed by Moonshot AI, Kimi Linear 48B A3B Base is a base (pretrained) language model built for long-context tasks, scoring 84.3 on the RULER benchmark. Its hybrid linear attention architecture, built around Kimi Delta Attention, enables significant speedups and reduced memory usage. It supports a context length of up to 1M tokens, so it can process very long inputs efficiently.
Context: 1000K tokens
Max Output: -
Parameters: 49.1B
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
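
As a rough illustration of how an INT8 estimate can be derived from the listed parameter count, the sketch below multiplies the number of parameters by the bytes stored per parameter. It covers weights only and assumes one byte per parameter at INT8. The KV cache, activations, and framework overhead are deliberately left out, so real requirements will be higher. The function name `weight_memory_gb` is a hypothetical helper, not part of any provider API.

```python
def weight_memory_gb(num_params: float, bits_per_param: int = 8) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same precision and
    nothing else (KV cache, activations, runtime overhead) is counted.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Parameter count as listed on this page (49.1B).
PARAMS = 49.1e9

for label, bits in [("INT8", 8), ("FP16/BF16", 16)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.1f} GB for weights")
```

Under these assumptions the weights alone come to roughly 49.1 GB at INT8 and about twice that at 16-bit precision. This is why the quantization level matters when sizing hardware.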
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026