Phi 3.5 Vision Instruct is a language model from Microsoft. It features a 128K context window and accepts image inputs alongside text.
Context: 128K
Max Output: 4K
Parameters: —
Data sourced from official provider APIs and documentation
Last updated: May 5, 2026