Microsoft's Table Transformer Structure Recognition is a vision model that detects table structure, such as rows and columns, in images. Its architecture is equivalent to DETR, a Transformer-based object detection model, with one notable difference in configuration: it uses the "normalize before" setting, so layernorm is applied before self- and cross-attention rather than after.
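The difference between applying layernorm before or after a sublayer can be shown with a toy residual block. The sketch below is only an illustration and assumes a few simplifications that are not part of the model: `toy_sublayer` is a hypothetical stand-in for attention, `layer_norm` has no learned scale or shift, and the input is a single short vector.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def toy_sublayer(x):
    """Hypothetical stand-in for an attention sublayer (an assumption for illustration)."""
    return [0.5 * v + 1.0 for v in x]


def pre_norm_block(x, sublayer):
    """'Normalize before': layernorm, then the sublayer, then the residual add."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]


def post_norm_block(x, sublayer):
    """'Normalize after': the sublayer, then the residual add, then layernorm."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


x = [1.0, 2.0, 4.0, 8.0]
pre = pre_norm_block(x, toy_sublayer)
post = post_norm_block(x, toy_sublayer)
print("pre-norm :", pre)
print("post-norm:", post)
```

In the pre-norm variant the residual path carries the input through unnormalized, whereas the post-norm output is always renormalized.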
Context: 1K
Max Output: -
Parameters: 28.8M
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
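As a rough illustration of how such an estimate can be derived from the parameter count listed above, the sketch below multiplies the number of parameters by the bytes each one occupies at a given precision. It assumes weights only (no activations, framework overhead, or buffers) and 1 MB = 10^6 bytes; the function name and precision table are illustrative, not taken from the provider.

```python
# Bytes needed to store one parameter at each precision (assumed values).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_mb(num_params, dtype="int8"):
    """Estimate weight storage in MB (10^6 bytes) for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


PARAMS = 28.8e6  # parameter count listed on this page

for dtype in ("fp32", "fp16", "int8"):
    print(f"{dtype}: {weight_memory_mb(PARAMS, dtype):.1f} MB")
```

Under these assumptions, INT8 weights for a 28.8M-parameter model come to roughly 28.8 MB; real memory use will be higher once activations and runtime overhead are included.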
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026