The L4 is a low-profile, energy-efficient inference accelerator with 24 GB of GDDR6, drawing just 72 W. It is well suited to high-density inference deployments in space- and power-constrained environments.
VRAM: 24 GB
Memory type: GDDR6
Memory bandwidth: 300 GB/s
TDP: 72 W
Smaller language models: inference for 7B-13B parameter models
Enterprise deployment: designed for 24/7 datacenter operation
Low-power inference
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
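As a rough illustration of how such an estimate works, the sketch below checks whether a model of a given parameter count fits in the L4's 24 GB at INT8 (1 byte per weight). The 20% overhead factor for KV cache, activations, and runtime buffers is an assumption for illustration, not a measured value; real fit depends on framework, context length, and batch size.

```python
def fits_in_vram(params_billions: float, vram_gb: float = 24.0,
                 bytes_per_param: int = 1, overhead: float = 1.2) -> bool:
    """Return True if the model's estimated footprint fits in vram_gb.

    bytes_per_param=1 corresponds to INT8 weights; the overhead
    multiplier (assumed ~20%) covers KV cache and runtime buffers.
    """
    weight_gb = params_billions * bytes_per_param  # 1e9 params * 1 byte ~= 1 GB
    return weight_gb * overhead <= vram_gb

for size in (7, 13, 34):
    verdict = "fits" if fits_in_vram(size) else "does not fit"
    print(f"{size}B @ INT8: {verdict}")
```

Under these assumptions a 7B or 13B model fits comfortably, which is consistent with the 7B-13B range listed above, while a 34B model would not.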
Added Jan 25, 2026
Last updated: Jan 25, 2026