Developed by Moonshot AI, Kimi-VL-A3B-Thinking is an open-source Mixture-of-Experts vision-language model built for advanced multimodal reasoning, long-context understanding, and agent capabilities. It is particularly strong at multi-turn agent interaction and at a range of challenging vision-language tasks, including image and video comprehension, OCR, and mathematical reasoning.
Context: 131K
Max Output: 32K
Parameters: 16.4B
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
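As a rough illustration of how an INT8-based estimate relates to the parameter count listed above, the sketch below computes the memory needed for the weights alone. The function name and the decimal-gigabyte convention are assumptions made for this example, not the page's actual method, and the figure excludes KV cache, activations, and framework overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight-only memory in decimal GB (1 GB = 1e9 bytes).

    Hypothetical helper for illustration: it ignores KV cache,
    activations, and runtime overhead, so real usage will be higher.
    """
    bytes_per_param = bits_per_param / 8
    return params_billion * bytes_per_param


if __name__ == "__main__":
    # 16.4B is the total parameter count listed on this page.
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{weight_memory_gb(16.4, bits):.1f} GB")
```

Because the model is a Mixture-of-Experts design, all experts generally need to be resident in memory even though only a subset is active per token, which is why the total parameter count is used here.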
Data sourced from official provider APIs and documentation
Last updated: Jun 24, 2026