The original A100 introduced the Ampere architecture with third-generation Tensor Cores and 40 GB of HBM2 memory (HBM2e arrived with the later 80 GB variant). It is widely deployed across major cloud providers for AI and scientific computing.
VRAM: 40 GB
Memory: HBM2
Bandwidth: 1555 GB/s
TDP: 400 W
Medium Language Models: Inference for models up to 70B parameters
Distributed Training: Multi-node training with fast interconnects
Enterprise Deployment: Designed for 24/7 datacenter operation
Original A100
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
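As a rough back-of-the-envelope check on those estimates, the sketch below computes how many 40 GB A100s a model's weights would need at a given precision. The 20% overhead multiplier for KV cache and activations is an assumption for illustration, not a measured figure; real requirements depend on framework, sequence length, and batch size.

```python
import math

def a100_40gb_count(params_b: float, bytes_per_param: float = 1.0,
                    overhead: float = 1.2, vram_gb: float = 40.0) -> int:
    """Estimate how many 40 GB A100s are needed to hold a model.

    params_b: parameter count in billions (e.g. 70 for a 70B model).
    bytes_per_param: 1.0 for INT8, 2.0 for FP16.
    overhead: assumed multiplier for KV cache / activations.
    """
    weight_gb = params_b * bytes_per_param  # 1B params at INT8 ~ 1 GB
    total_gb = weight_gb * overhead
    return math.ceil(total_gb / vram_gb)

# A 70B model at INT8: ~70 GB of weights plus overhead spans multiple GPUs.
print(a100_40gb_count(70))       # INT8
print(a100_40gb_count(13, 2.0))  # 13B at FP16 fits on a single card
```

Under these assumptions, a 70B INT8 model needs more than one 40 GB card, which is why the "up to 70B" use case above typically implies tensor-parallel serving across two or more A100s.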
Added Jan 25, 2026
Last updated: Jan 25, 2026