NVIDIA's flagship Blackwell-architecture datacenter GPU with 192 GB of HBM3e memory and 8 TB/s of memory bandwidth. Designed for trillion-parameter LLM training and large-scale inference, with 5th-generation Tensor Cores and FP4 precision support.
VRAM: 192 GB
Memory: HBM3e
Bandwidth: 8000 GB/s (8 TB/s)
TDP: 1000 W
Large Language Models: training and inference for models such as GPT-4 and Llama 70B+
Deep Learning Training: high-performance training for neural networks
Distributed Training: multi-node training with fast interconnects
High-Throughput Inference: optimized for batched inference workloads (see the sketch after this list)
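A useful back-of-the-envelope bound for decode-phase inference: each generated token must stream every weight through the memory system once, so per-sequence throughput is capped at memory bandwidth divided by model size in bytes. The sketch below applies that rule to the figures above; the model sizes and precisions are illustrative assumptions, not measured results.

```python
# Bandwidth-bound ceiling on decode throughput: each decoded token reads
# all weights once, so tokens/s <= bandwidth / model size in bytes.
# Illustrative assumption: ignores KV-cache traffic and kernel efficiency.

BANDWIDTH_GB_S = 8000  # from the spec table above

def decode_tokens_per_sec_bound(params_billion: float, bytes_per_param: float) -> float:
    """Upper bound on decode tokens/s for a single sequence."""
    model_gb = params_billion * bytes_per_param
    return BANDWIDTH_GB_S / model_gb

for label, params, bpp in [("70B @ FP16", 70, 2.0),
                           ("70B @ INT8", 70, 1.0),
                           ("70B @ FP4", 70, 0.5)]:
    print(f"{label}: <= {decode_tokens_per_sec_bound(params, bpp):.0f} tokens/s")
```

Batching amortizes that single weight read across every sequence in the batch, which is why aggregate throughput climbs with batch size and why the card calls out batched workloads.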
Compute throughput figures assume 2:4 structured sparsity. Two-die package connected via NV-HBI (NVIDIA High-Bandwidth Interface). Liquid-cooled HGX SXM module.
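Concretely, 2:4 structured sparsity means every contiguous group of four weights keeps at most two nonzeros, the pattern the sparse Tensor Cores accelerate; dense workloads should expect roughly half the quoted throughput. A minimal NumPy sketch of checking that pattern (the helper name is hypothetical):

```python
import numpy as np

def satisfies_2_to_4(weights: np.ndarray) -> bool:
    """True if every contiguous group of 4 values has at most 2 nonzeros
    (the 2:4 structured-sparsity pattern)."""
    groups = weights.reshape(-1, 4)  # total size must be a multiple of 4
    return bool(np.all(np.count_nonzero(groups, axis=1) <= 2))

w = np.array([[0.5, 0.0, -1.2, 0.0],   # 2 nonzeros in this group: OK
              [0.0, 0.3, 0.0, 0.0]])   # 1 nonzero: OK
print(satisfies_2_to_4(w))  # True
```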
Model-fit estimates assume INT8 quantization; actual fit depends on framework and batch size.
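The fit rule behind those estimates is simple: at INT8 each parameter occupies one byte, plus some allowance for KV cache, activations, and framework buffers. A minimal sketch, assuming a flat 20% overhead factor (that factor is an illustration, not a measured value):

```python
VRAM_GB = 192  # from the spec table above

def fits_in_vram(params_billion: float, bytes_per_param: float = 1.0,
                 overhead: float = 0.20) -> bool:
    """Crude fit check: INT8 weights (1 byte/param) plus a flat overhead
    allowance for KV cache, activations, and framework buffers."""
    needed_gb = params_billion * bytes_per_param * (1.0 + overhead)
    return needed_gb <= VRAM_GB

print(fits_in_vram(70))    # True:  ~84 GB needed
print(fits_in_vram(180))   # False: ~216 GB exceeds 192 GB
```

In practice the KV cache grows with batch size and sequence length, so a model that fits at small batches can overflow at serving scale.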
Added: Apr 30, 2026
Last updated: Apr 30, 2026