The V100 pioneered Tensor Cores and mixed-precision training, here in its 32 GB HBM2 variant. A landmark GPU that defined modern AI infrastructure, it is still deployed for many workloads.
VRAM: 32 GB
Memory: HBM2
Bandwidth: 900 GB/s
TDP: 300 W
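The bandwidth figure sets a useful ceiling for capacity planning: in batch-1 autoregressive decoding, every generated token must stream the full set of weights from memory, so throughput is bounded by bandwidth divided by model size in bytes. A minimal sketch of that bound; the 900 GB/s number comes from the spec table above, while the model sizes and precisions are illustrative assumptions, not figures from this page:

```python
# Rough upper bound on single-stream (batch size 1) decode throughput:
# each generated token streams every weight from memory once, so
# tokens/s <= bandwidth / model size in bytes. Real throughput is lower
# once compute, KV-cache reads, and framework overhead are counted.

BANDWIDTH_BYTES_PER_S = 900e9  # V100 HBM2 bandwidth from the spec table

def max_decode_tokens_per_s(params_billion: float, bytes_per_param: float) -> float:
    """Bandwidth-bound ceiling on tokens/s for one decode stream."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return BANDWIDTH_BYTES_PER_S / model_bytes

# Illustrative model sizes (assumptions, not from this page):
for label, params, bpp in [("7B FP16", 7, 2), ("13B FP16", 13, 2), ("13B INT8", 13, 1)]:
    print(f"{label}: <= {max_decode_tokens_per_s(params, bpp):.0f} tokens/s")
```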
Smaller Language Models: inference for 7B-13B parameter models
Enterprise Deployment: designed for 24/7 datacenter operations
Previous-generation flagship
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
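To make that note concrete, here is a minimal back-of-the-envelope fit check. The 1.2x overhead factor for KV cache, activations, and framework buffers is an assumption, not a figure from this page; as the note says, real usage varies with framework, context length, and batch size.

```python
# Back-of-the-envelope VRAM fit check for the 32 GB V100. Weights take
# roughly params * bytes_per_param (1 byte at INT8, 2 at FP16); the 1.2x
# overhead factor for KV cache and framework buffers is an assumption.

VRAM_GB = 32  # from the spec table above

def fits(params_billion: float, bytes_per_param: float, overhead: float = 1.2) -> bool:
    needed_gb = params_billion * bytes_per_param * overhead  # 1e9 params * bytes ~ GB
    return needed_gb <= VRAM_GB

print(fits(13, 1))  # 13B INT8: ~15.6 GB -> True
print(fits(13, 2))  # 13B FP16: ~31.2 GB -> True, with little headroom
print(fits(30, 2))  # 30B FP16: ~72 GB  -> False
```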
Last updated: Jan 25, 2026