Developed by Google, SigLIP Base Patch 16 224 is a multimodal model that excels at tasks such as zero-shot image classification and image-text retrieval. Its sigmoid loss operates on individual image-text pairs and does not require a global view of pairwise similarities for normalization, which improves performance at smaller batch sizes and also allows the batch size to be scaled up further.
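To make the pairwise nature of the sigmoid loss concrete, here is a minimal pure-Python sketch. It is an illustration and not the model's actual training code: the function names are hypothetical, and the default `temperature` and `bias` values are assumptions chosen for the example. Each image-text pair is scored as an independent binary decision (match or no match), so no softmax over the whole batch is needed.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def l2_normalize(vec):
    # Scale a vector to unit length.
    norm = math.sqrt(sum(c * c for c in vec))
    return [c / norm for c in vec]


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    # temperature and bias are illustrative defaults (assumptions);
    # in training they would be learnable scalars.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    total = 0.0
    for i, img in enumerate(imgs):
        for j, txt in enumerate(txts):
            logit = temperature * sum(a * b for a, b in zip(img, txt)) + bias
            # +1 for the matching pair (same index), -1 for every other pair.
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    # Average over the batch of images.
    return -total / n


# Toy embeddings (made up for illustration): aligned pairs vs. swapped pairs.
aligned_loss = pairwise_sigmoid_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = pairwise_sigmoid_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)
```

Because every term in the double loop depends on a single image-text pair, the sum can be split into chunks and computed independently, which is the property that makes larger batches practical.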
Input
Output
Context: -
Max Output: -
Parameters: 203.2M
Input Modalities
Output Modalities
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
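As a rough illustration of how an INT8-based estimate can be derived, the sketch below multiplies the 203.2M parameter count listed on this page by the bytes stored per parameter. It covers weights only. Activations, framework overhead, and any layers kept at higher precision are ignored, and the page does not state its exact estimation method, so treat this as an assumption-laden lower bound. The helper name `weight_memory_bytes` is hypothetical.

```python
PARAMETERS = 203.2e6  # parameter count listed on this page


def weight_memory_bytes(num_params, bytes_per_param):
    # Weights-only footprint; excludes activations and runtime overhead.
    return num_params * bytes_per_param


int8_bytes = weight_memory_bytes(PARAMETERS, 1)  # INT8: 1 byte per parameter
fp16_bytes = weight_memory_bytes(PARAMETERS, 2)  # FP16: 2 bytes, for comparison

print(f"INT8 weights: {int8_bytes / 1e6:.1f} MB")
print(f"FP16 weights: {fp16_bytes / 1e6:.1f} MB")
```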
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026