NVIDIA's Segformer B5 Finetuned Cityscapes 1024 1024 is a SegFormer model fine-tuned for semantic segmentation on the Cityscapes dataset at 1024x1024 resolution. Its hierarchical Transformer encoder, pre-trained on ImageNet-1k, produces multi-scale features that feed the segmentation head.
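To make "hierarchical" concrete, the sketch below computes the feature-map shapes a four-stage encoder would produce for a 1024x1024 input. The stage strides (4, 8, 16, 32) and channel widths (64, 128, 320, 512) are assumptions taken from the commonly published SegFormer-B5 (MiT-B5) configuration, not from this listing, and the helper name `encoder_feature_shapes` is hypothetical.

```python
# Assumed MiT-B5 encoder settings; these values are not stated in this listing.
STRIDES = (4, 8, 16, 32)         # downsampling factor of each stage vs. the input
CHANNELS = (64, 128, 320, 512)   # feature channels produced by each stage


def encoder_feature_shapes(height, width):
    """Return (channels, height, width) for each encoder stage's output."""
    return [(c, height // s, width // s) for c, s in zip(CHANNELS, STRIDES)]


# Walk through the stages for the 1024x1024 fine-tuning resolution.
for stage, (c, h, w) in enumerate(encoder_feature_shapes(1024, 1024), start=1):
    print(f"stage {stage}: channels={c}, size={h}x{w}")
```

Each stage halves the spatial size of the previous one while widening the channels, so the decoder sees both fine detail (stage 1) and coarse context (stage 4).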
Context: 1K
Max Output: -
Parameters: 84.7M
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
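The estimates this note refers to are not reproduced here. As a rough illustration only, the sketch below derives a weight-only footprint from the listed 84.7M parameters, assuming 1 byte per parameter for INT8 (2 for FP16, 4 for FP32). It ignores activations, framework overhead, and any layers left unquantized, so real usage will be higher. The helper name `weight_memory_mb` is hypothetical.

```python
# Bytes needed to store one parameter at each precision (assumed, weight-only).
BYTES_PER_PARAM = {"int8": 1, "fp16": 2, "fp32": 4}

PARAMS = 84.7e6  # parameter count from the listing


def weight_memory_mb(num_params, dtype="int8"):
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


# Compare the listed INT8 basis against two common higher precisions.
for dtype in ("int8", "fp16", "fp32"):
    print(f"{dtype}: ~{weight_memory_mb(PARAMS, dtype):.1f} MB")
```

Under these assumptions the weights alone take roughly 85 MB at INT8, about a quarter of the FP32 figure.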
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026