Whisper Large V3 Turbo is an automatic speech recognition and speech translation model developed by OpenAI. It is a fine-tuned, pruned version of Whisper Large V3 that cuts the number of decoder layers from 32 to 4, giving significantly faster inference with a minor trade-off in quality. The model was trained on over 5 million hours of labeled data and demonstrates strong zero-shot generalization across many datasets and domains.
Context: -
Max Output: -
Parameters: 808.9M
Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.
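As a rough illustration of how such an estimate can be derived, the sketch below multiplies the listed parameter count (808.9M) by an assumed storage size per parameter. The bytes-per-parameter table and the helper name `weight_memory_gb` are illustrative assumptions, not part of any provider API, and the result covers weights only: activations, decoding caches, and framework overhead are not included.

```python
# Rough weight-memory estimate: weights only, no activations or runtime overhead.
PARAMS = 808.9e6  # parameter count listed on this page

# Assumed storage size per parameter for common precisions (illustrative).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(params: float, precision: str) -> float:
    """Return approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(PARAMS, precision):.2f} GB")
```

Under these assumptions the INT8 weights come to roughly 0.81 GB, and each doubling of bytes per parameter doubles that figure. Real deployments will usually need more than this, which is consistent with the caveat above.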
Data sourced from official provider APIs and documentation
Last updated: Jun 24, 2026