Developed by Microsoft, LayoutLMv2 Base Uncased is a multimodal pre-trained model for visually rich document understanding tasks. Its pre-training models the interactions among text, layout, and image. It accepts at most 512 text tokens per input, so longer documents need to be truncated or split into several passes.
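One common way to work within the 512-token limit is to split a long document into overlapping windows and run the model on each window separately. The following is a minimal sketch of that idea in plain Python, not code from the model's library. The function name `split_into_windows`, the 64-token overlap, and the assumption that each token comes with one bounding box are all illustrative choices. The sketch also ignores special tokens such as `[CLS]` and `[SEP]`, which take up part of the limit in practice.

```python
from typing import List, Sequence, Tuple

MAX_TOKENS = 512  # text-token limit stated in the description above

Box = Tuple[int, int, int, int]  # hypothetical (x0, y0, x1, y1) layout box per token


def split_into_windows(
    tokens: Sequence[int],
    boxes: Sequence[Box],
    max_len: int = MAX_TOKENS,
    overlap: int = 64,  # assumed overlap; tune for the task
) -> List[Tuple[List[int], List[Box]]]:
    """Split aligned token and box sequences into overlapping windows."""
    if len(tokens) != len(boxes):
        raise ValueError("tokens and boxes must have the same length")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be in [0, max_len)")
    if not tokens:
        return []

    step = max_len - overlap
    windows = []
    start = 0
    while True:
        end = start + max_len
        # Slice tokens and boxes together so they stay aligned.
        windows.append((list(tokens[start:end]), list(boxes[start:end])))
        if end >= len(tokens):
            break
        start += step
    return windows


if __name__ == "__main__":
    ids = list(range(1000))
    bxs = [(0, 0, 1, 1)] * 1000
    for toks, _ in split_into_windows(ids, bxs):
        print(len(toks), toks[0], toks[-1])
```

The overlap means a token near a window boundary is seen with context on both sides in at least one window. How the per-window predictions are merged afterwards depends on the task and is left out here.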
Context: 1K
Max Output: -
Parameters: 200M
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
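As a rough illustration of how an INT8-based estimate could be derived from the listed parameter count, the sketch below multiplies the number of parameters by the bytes needed per parameter. It assumes 1 byte per parameter for INT8 and counts weights only. Activations, framework overhead, and any layers kept at higher precision are ignored, so real usage will be higher. The helper name `weight_memory_mb` is hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_mb(num_params: int, dtype: str = "int8") -> float:
    """Approximate memory for model weights alone, in megabytes (10^6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    params = 200_000_000  # the 200M figure listed on this page
    for dtype in ("int8", "fp16", "fp32"):
        print(dtype, weight_memory_mb(params, dtype), "MB")
```

Under these assumptions, 200M parameters come to roughly 200 MB of weights at INT8 and about 800 MB at FP32.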
Data sourced from official provider APIs and documentation
Last updated: Jun 24, 2026