The first multimodal (text-and-image-to-text) model from Mistral AI. Its weights were released via torrent: https://x.com/mistralai/status/1833758285167722836.
Context: 33K tokens
Max Output: 4K tokens
Parameters: 12B
Input Modalities: text, image
Output Modalities: text
Hardware requirement estimates are based on INT8 quantization. Actual requirements vary by framework and configuration.
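As a rough illustration of what an INT8-based estimate implies, the sketch below computes the memory needed for the weights alone from the 12B parameter count. The helper name `weight_memory_gb` is hypothetical, and the calculation deliberately ignores the KV cache, activations, and framework overhead, so real requirements will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width, with no
    allowance for KV cache, activations, or runtime overhead.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


PARAMS = 12e9  # 12B parameters, as listed for this model

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(PARAMS, bits):.0f} GB for weights")
```

Under these assumptions, INT8 stores one byte per parameter, so a 12B-parameter model needs about 12 GB for weights before any runtime overhead is counted.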
Data sourced from official provider APIs and documentation
Last updated: Apr 10, 2026