NVIDIA's flagship Blackwell-architecture datacenter GPU with 192 GB of HBM3e memory and 8 TB/s of memory bandwidth. Designed for trillion-parameter LLM training and large-scale inference, with 5th-generation Tensor Cores and FP4 precision support.
VRAM: 192 GB
Memory: HBM3e
Bandwidth: 8000 GB/s (8 TB/s)
TDP: 1000 W
Large Language Models: training and inference for models such as GPT-4 and Llama 70B+
Deep Learning Training: high-performance training for neural networks
Distributed Training: multi-node training with fast interconnects
High-Throughput Inference: optimized for batched inference workloads (see the sketch after this list)
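A useful back-of-the-envelope bound for decode-phase inference: each generated token must stream every weight through the memory system once, so per-sequence throughput is capped at memory bandwidth divided by model size in bytes. The sketch below applies that rule to the figures above; the model sizes and precisions are illustrative assumptions, not measured results.

```python
# Bandwidth-bound ceiling on decode throughput: each decoded token reads
# all weights once, so tokens/s <= bandwidth / model size in bytes.
# Illustrative assumption: ignores KV-cache traffic and kernel efficiency.

BANDWIDTH_GB_S = 8000  # from the spec table above

def decode_tokens_per_sec_bound(params_billion: float, bytes_per_param: float) -> float:
    """Upper bound on decode tokens/s for a single sequence."""
    model_gb = params_billion * bytes_per_param
    return BANDWIDTH_GB_S / model_gb

for label, params, bpp in [("70B @ FP16", 70, 2.0),
                           ("70B @ INT8", 70, 1.0),
                           ("70B @ FP4", 70, 0.5)]:
    print(f"{label}: <= {decode_tokens_per_sec_bound(params, bpp):.0f} tokens/s")
```

Batching amortizes that single weight read across every sequence in the batch, which is why aggregate throughput climbs with batch size and why the card calls out batched workloads.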
Compute throughput figures assume 2:4 structured sparsity. Two-die package connected via NV-HBI (NVIDIA High-Bandwidth Interface). Liquid-cooled HGX SXM module.
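Concretely, 2:4 structured sparsity means every contiguous group of four weights keeps at most two nonzeros, the pattern the sparse Tensor Cores accelerate; dense workloads should expect roughly half the quoted throughput. A minimal NumPy sketch of checking that pattern (the helper name is hypothetical):

```python
import numpy as np

def satisfies_2_to_4(weights: np.ndarray) -> bool:
    """True if every contiguous group of 4 values has at most 2 nonzeros
    (the 2:4 structured-sparsity pattern)."""
    groups = weights.reshape(-1, 4)  # total size must be a multiple of 4
    return bool(np.all(np.count_nonzero(groups, axis=1) <= 2))

w = np.array([[0.5, 0.0, -1.2, 0.0],   # 2 nonzeros in this group: OK
              [0.0, 0.3, 0.0, 0.0]])   # 1 nonzero: OK
print(satisfies_2_to_4(w))  # True
```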
Model-fit estimates assume INT8 quantization; actual fit depends on framework and batch size.
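The fit rule behind those estimates is simple: at INT8 each parameter occupies one byte, plus some allowance for KV cache, activations, and framework buffers. A minimal sketch, assuming a flat 20% overhead factor (that factor is an illustration, not a measured value):

```python
VRAM_GB = 192  # from the spec table above

def fits_in_vram(params_billion: float, bytes_per_param: float = 1.0,
                 overhead: float = 0.20) -> bool:
    """Crude fit check: INT8 weights (1 byte/param) plus a flat overhead
    allowance for KV cache, activations, and framework buffers."""
    needed_gb = params_billion * bytes_per_param * (1.0 + overhead)
    return needed_gb <= VRAM_GB

print(fits_in_vram(70))    # True:  ~84 GB needed
print(fits_in_vram(180))   # False: ~216 GB exceeds 192 GB
```

In practice the KV cache grows with batch size and sequence length, so a model that fits at small batches can overflow at serving scale.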
Added: Apr 30, 2026
Last updated: Apr 30, 2026