The NVIDIA L40S is built on the Ada Lovelace architecture with 48 GB of GDDR6 memory and is optimized for AI inference and generative AI. It combines strong AI performance with graphics capabilities, which suits mixed workloads.
VRAM: 48 GB
Memory: GDDR6
Bandwidth: 864 GB/s
TDP: 350 W
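Medium Language Models: Inference for models up to 70B parameters (at the upper end, weights alone exceed 48 GB at INT8, so a single card needs lower-bit quantization or multi-GPU sharding)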
Enterprise Deployment: Designed for 24/7 datacenter operations
AI inference and graphics
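Estimates assume INT8 quantization, which stores about 1 byte per parameter, or roughly 1 GB of weights per billion parameters. Actual fit depends on framework, batch size, and context length.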
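The fit estimate above comes down to simple arithmetic, which the following minimal sketch illustrates. The function names and the 20% headroom for KV cache, activations, and framework overhead are assumptions made for illustration, not figures from this page; real overhead varies widely with batch size and context length.

```python
# Rough check of whether a model's weights fit in a GPU's VRAM.
# All names here are hypothetical helpers; the 20% overhead is an assumed
# placeholder for KV cache, activations, and framework buffers.

def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    bytes_per_param = bits_per_param / 8
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    return params_billion * bytes_per_param


def fits_in_vram(
    params_billion: float,
    bits_per_param: int,
    vram_gb: float = 48.0,           # L40S VRAM from the spec list
    overhead_fraction: float = 0.2,  # assumed headroom, not a measured value
) -> bool:
    """Return True if weights plus assumed overhead fit in vram_gb."""
    needed = weight_memory_gb(params_billion, bits_per_param) * (1 + overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    for size in (7, 13, 34, 70):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(size, bits)
            verdict = "fits" if fits_in_vram(size, bits) else "does not fit"
            print(f"{size}B @ {bits}-bit: ~{gb:.1f} GB weights, {verdict} in 48 GB")
```

Under these assumptions, a 70B model at INT8 needs about 70 GB for weights alone and does not fit on a single 48 GB card, while the same model at 4-bit (about 35 GB) does. Treat the result as a first-pass sizing check rather than a guarantee.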
Added: Jan 25, 2026
Last updated: Jan 25, 2026