NVIDIA

Bigvgan_v2_44khz_128band_512x

Name: Bigvgan_v2_44khz_128band_512x
Author: NVIDIA

NVIDIA's BigVGAN is an open-source neural vocoder: it generates audio waveforms from mel spectrograms, which makes it suitable as the synthesis stage of text-to-speech and other audio-processing pipelines. It handles diverse audio types well, including speech in multiple languages, environmental sounds, and instruments, thanks to its large-scale training on varied datasets.
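The model identifier appears to encode the vocoder's configuration. The following is a minimal sketch, assuming the name follows the pattern `bigvgan_<version>_<rate>khz_<bands>band_<ratio>x`, where `<bands>` is the number of mel bands in the input and `<ratio>` is the number of waveform samples produced per mel frame. The `VocoderConfig` class and both helper functions are hypothetical names introduced for illustration; they are not part of any NVIDIA library.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VocoderConfig:
    """Settings read from the model identifier (hypothetical helper type)."""
    version: str
    sample_rate_khz: int   # nominal rate from the name, e.g. 44 for "44khz"
    mel_bands: int         # number of mel bands expected in the input
    upsample_ratio: int    # waveform samples generated per mel frame


# Assumed naming pattern; not an official specification.
_NAME_PATTERN = re.compile(
    r"^bigvgan_(v\d+)_(\d+)khz_(\d+)band_(\d+)x$", re.IGNORECASE
)


def parse_model_name(name: str) -> VocoderConfig:
    """Split an identifier such as 'Bigvgan_v2_44khz_128band_512x' into fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    version, rate, bands, ratio = match.groups()
    return VocoderConfig(version.lower(), int(rate), int(bands), int(ratio))


def output_samples(config: VocoderConfig, mel_frames: int) -> int:
    """Waveform length implied by the upsampling ratio for a given frame count."""
    return mel_frames * config.upsample_ratio


config = parse_model_name("Bigvgan_v2_44khz_128band_512x")
print(config)
print(output_samples(config, 100))
```

Under these assumptions, a spectrogram of 100 frames would correspond to 100 × 512 waveform samples. The exact sample rate in hertz should be taken from the model's own configuration rather than inferred from the name.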

Technical Specifications

Model Type: Audio
Context Window: Not available
Max Output Tokens: Not available
Parameters: Not available
Release Date: Jul 15, 2024
Training Cutoff: Not available
License: MIT
Open Source: Yes

Input Modalities: Audio
Output Modalities: Audio

Resources & Links

HuggingFace

Model card on HuggingFace

Data sourced from official provider APIs and documentation

Last updated: Jun 23, 2026

Inferbase
