GPT-4O-TRANSCRIBE-DIARIZE is a model from OpenAI that supports speech-to-text and text generation, with a context window of 128,000 tokens. Its main strength is processing audio input and transcribing spoken language into text.
Input
Output
Context: 128K tokens
Max Output: 2K tokens
Parameters: -
Input Modalities
Output Modalities
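The context and max-output figures listed above imply a simple token budget. The sketch below is only an illustration under stated assumptions: it reads "2K" as 2,000 tokens (the exact value may differ, for example 2,048), it assumes the output is counted against the same 128,000-token window, and the names `max_input_tokens` and `fits` are hypothetical helpers, not part of any OpenAI API.

```python
# Token-budget sketch based on the figures listed on this page.
CONTEXT_WINDOW = 128_000  # "Context: 128K", stated in the text as 128,000 tokens
MAX_OUTPUT = 2_000        # assumption: "Max Output: 2K" read as 2,000 tokens


def max_input_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Return the input tokens left after reserving room for the output.

    Assumes input and output share one context window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT")
    return CONTEXT_WINDOW - reserved_output


def fits(input_tokens: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Check whether a request of the given size fits in the window."""
    return 0 <= input_tokens <= max_input_tokens(reserved_output)


if __name__ == "__main__":
    print(max_input_tokens())  # input budget with the full output reserved
    print(fits(100_000))       # a 100,000-token request
```

Under these assumptions, reserving the full output allowance leaves 126,000 tokens for input. If the provider counts audio input differently from text tokens, the budget would need to be adjusted accordingly.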
Data sourced from official provider APIs and documentation
Last updated: Jun 23, 2026