NVIDIA's Segformer B 1 Finetuned Ade 512 512 is a vision model exceling at semantic segmentation tasks, particularly on datasets like ADE20K. It utilizes a hierarchical Transformer encoder paired with a lightweight all-MLP decode head, achieving notable results. The model is fine-tuned at a resolution of 512x512, demonstrating its capability to handle high-resolution images.
Input
Output
Context
-
Max Output
-
Parameters
-
Input Modalities
Output Modalities
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026
Automatically route workloads to the right model for every task, every time.