The H100 NVL pairs two H100 GPUs connected via NVLink, with 94 GB of HBM3 per GPU (188 GB combined per pair). It is optimized for large language model inference in PCIe-based systems.
VRAM: 94 GB
Memory: HBM3
Bandwidth: 3,900 GB/s
TDP: 400 W
Large Language Models: training and inference for models like GPT-4 and Llama 70B+
Deep Learning Training: high-performance training for neural networks
Distributed Training: multi-node training with fast interconnects
High-Throughput Inference: optimized for batched inference workloads
Dual-GPU for inference
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
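The fit estimate above can be sketched as a back-of-the-envelope calculation: INT8 weights take roughly 1 byte per parameter, plus some headroom for KV cache and activations. A minimal sketch, where the 20% overhead factor is an illustrative assumption (not a vendor figure) and actual fit depends on framework and batch size:

```python
def fits_in_vram(params_billion: float, vram_gb: float,
                 bytes_per_param: float = 1.0, overhead: float = 1.2) -> bool:
    """Return True if quantized model weights (plus an assumed
    overhead factor for KV cache and activations) fit in VRAM."""
    needed_gb = params_billion * bytes_per_param * overhead
    return needed_gb <= vram_gb

# A 70B model at INT8 needs ~84 GB with overhead: fits in one 94 GB NVL GPU.
print(fits_in_vram(70, 94))                        # True
# At FP16 (2 bytes/param) it needs ~168 GB: requires the paired 188 GB.
print(fits_in_vram(70, 94, bytes_per_param=2.0))   # False
print(fits_in_vram(70, 188, bytes_per_param=2.0))  # True
```

This is why the INT8 caveat matters: the same 70B model that fits on a single NVL GPU when quantized needs both GPUs of the pair at half precision.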
Added Jan 25, 2026
Last updated: Jan 25, 2026