Dual-slot PCIe form factor of Intel's third-generation Gaudi accelerator, with 128 GB HBM2e and 24 integrated 200 GbE ports for scale-out via standard Ethernet (no proprietary fabric needed). Targets cost-conscious LLM inference and fine-tuning.
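The scale-out claim above is easy to sanity-check with arithmetic: 24 integrated ports at 200 GbE each give the aggregate line rate per card (before Ethernet protocol overhead, which this sketch ignores).

```python
# Aggregate scale-out bandwidth per card from the figures above:
# 24 ports x 200 GbE (raw line rate; protocol overhead not modeled).

ports = 24
gbit_per_port = 200

total_gbps = ports * gbit_per_port   # aggregate line rate in Gb/s
total_gb_per_s = total_gbps / 8      # convert bits to bytes

print(total_gbps, "Gb/s =", total_gb_per_s, "GB/s")
```

At line rate this works out to 4,800 Gb/s (600 GB/s) of Ethernet bandwidth per card; deliverable throughput depends on the switch fabric and collective-communication library in use.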
VRAM: 128 GB
Memory type: HBM2e
Memory bandwidth: 3,700 GB/s
TDP: 600 W
Large Language Models: training and inference for models like GPT-4, Llama 70B+
Deep Learning Training: high-performance training for neural networks
Distributed Training: multi-node training with fast interconnects
High-Throughput Inference: optimized for batched inference workloads
compute_cores reflects 64 TPC (Tensor Processor Cores); ai_accelerators is the count of dedicated MME (Matrix Multiplication Engines). Compute figures are dense (no sparsity).
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
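A rough way to apply the note above: at INT8, weights take about 1 byte per parameter, so a quick fit check against the card's 128 GB HBM is simple arithmetic. The 20% headroom factor below is an assumption for activations, KV cache, and framework buffers, not a vendor figure.

```python
# Sketch of an INT8 fit estimate against 128 GB of HBM.
# The overhead factor is assumed, not measured; real fit depends
# on framework, sequence length, and batch size (as the note says).

HBM_GB = 128  # per-card HBM2e capacity from the spec above

def fits_int8(params_billion: float, overhead: float = 0.20) -> bool:
    """INT8 stores ~1 byte/parameter, so 1 GB per billion parameters;
    add assumed headroom for activations, KV cache, and buffers."""
    weight_gb = params_billion * 1.0
    return weight_gb * (1 + overhead) <= HBM_GB

print(fits_int8(70))    # a 70B model: ~84 GB with headroom
print(fits_int8(180))   # a 180B model: over budget on one card
```

By this estimate a 70B-parameter model fits comfortably on a single card at INT8, while models approaching ~105B+ would need quantization below 8 bits or multi-card sharding.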
Added: Apr 30, 2026
Last updated: Apr 30, 2026
From model selection to production: one platform, no fragmentation.