The H200 is NVIDIA's Hopper-architecture flagship AI accelerator, featuring 141 GB of HBM3e memory for handling the largest language models. It delivers strong performance for both training and inference workloads.
VRAM: 141 GB
Memory type: HBM3e
Memory bandwidth: 4,800 GB/s
TDP: 700 W
Large Language Models: training and inference for models like GPT-4 and Llama 70B+
Deep Learning Training: high-performance training for neural networks
Distributed Training: multi-node training with fast interconnects
High-Throughput Inference: optimized for batched inference workloads
Optimized for large language models with 141 GB of HBM3e
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
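The fit estimate above can be sketched as a quick back-of-the-envelope check. This is a minimal sketch, not the site's actual estimator: it assumes roughly 1 byte per parameter under INT8 plus a flat ~20% overhead for KV cache and activations, both of which vary by framework, context length, and batch size.

```python
def fits_in_vram(params_billions: float, vram_gb: float = 141.0,
                 bytes_per_param: float = 1.0, overhead: float = 0.20) -> bool:
    """Rough INT8 fit check: weights plus a flat overhead factor.

    Assumptions (illustrative, not exact): ~1 byte/param for INT8 weights,
    ~20% extra for KV cache and activations. Real usage depends on the
    framework, sequence length, and batch size.
    """
    needed_gb = params_billions * bytes_per_param * (1 + overhead)
    return needed_gb <= vram_gb

print(fits_in_vram(70))    # Llama 70B in INT8 -> True (~84 GB needed)
print(fits_in_vram(180))   # -> False (~216 GB needed)
```

Swapping `bytes_per_param` to 2.0 gives a comparable FP16/BF16 estimate.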
Added Jan 25, 2026
Last updated: Jan 25, 2026