The original A100 introduced the Ampere architecture with third-generation Tensor Cores and 40 GB of HBM2 memory (HBM2e arrived with the later 80 GB variant). It is widely deployed across major cloud providers for AI and scientific computing.
VRAM: 40 GB
Memory: HBM2
Bandwidth: 1555 GB/s
TDP: 400 W
Medium Language Models: Inference for models up to 70B parameters
Distributed Training: Multi-node training with fast interconnects
Enterprise Deployment: Designed for 24/7 datacenter operation
Original A100
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
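As a rough back-of-the-envelope check on those estimates, the sketch below computes how many 40 GB A100s a model's weights would need at a given precision. The 20% overhead multiplier for KV cache and activations is an assumption for illustration, not a measured figure; real requirements depend on framework, sequence length, and batch size.

```python
import math

def a100_40gb_count(params_b: float, bytes_per_param: float = 1.0,
                    overhead: float = 1.2, vram_gb: float = 40.0) -> int:
    """Estimate how many 40 GB A100s are needed to hold a model.

    params_b: parameter count in billions (e.g. 70 for a 70B model).
    bytes_per_param: 1.0 for INT8, 2.0 for FP16.
    overhead: assumed multiplier for KV cache / activations.
    """
    weight_gb = params_b * bytes_per_param  # 1B params at INT8 ~ 1 GB
    total_gb = weight_gb * overhead
    return math.ceil(total_gb / vram_gb)

# A 70B model at INT8: ~70 GB of weights plus overhead spans multiple GPUs.
print(a100_40gb_count(70))       # INT8
print(a100_40gb_count(13, 2.0))  # 13B at FP16 fits on a single card
```

Under these assumptions, a 70B INT8 model needs more than one 40 GB card, which is why the "up to 70B" use case above typically implies tensor-parallel serving across two or more A100s.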
Added Jan 25, 2026
Last updated: Jan 25, 2026