Air-cooled CDNA 4 accelerator with 288 GB of HBM3e and 8 TB/s of memory bandwidth, designed for retrofit into existing MI300/MI325 OAM platforms. New native FP4 and FP6 datatypes target inference cost-per-token reductions over Hopper-class hardware.
VRAM: 288 GB
Memory: HBM3e
Bandwidth: 8000 GB/s
TDP: 1000 W
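The bandwidth figure above sets a hard ceiling on single-request decode speed: each generated token must stream the model weights from HBM at least once. A minimal sketch of that back-of-envelope calculation, assuming a hypothetical 70B-parameter model with INT8 (1 byte/param) weights; neither figure comes from a vendor benchmark:

```python
# Upper bound on memory-bandwidth-bound decode throughput:
# tokens/s <= bandwidth / bytes read per token (here, all weights once).
def decode_tokens_per_s(params: float, bytes_per_param: float,
                        bandwidth_gb_s: float) -> float:
    weight_bytes = params * bytes_per_param  # total weight footprint
    return bandwidth_gb_s * 1e9 / weight_bytes

# Example: 70B model, INT8 weights, 8000 GB/s
print(round(decode_tokens_per_s(70e9, 1.0, 8000)))  # -> 114
```

Real throughput lands below this bound (KV-cache reads, kernel overheads) and batching raises aggregate tokens/s well above the per-request figure.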
Large Language Models: Training and inference for models like GPT-4, Llama 70B+
Deep Learning Training: High-performance training for neural networks
Distributed Training: Multi-node training with fast interconnects
High-Throughput Inference: Optimized for batched inference workloads
Spec verification recommended: the numbers above reflect AMD's Advancing AI 2025 announcement; double-check them against the published MI350X datasheet before using them for production sizing.
Estimates based on INT8 quantization. Actual fit depends on framework and batch size.
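As a rough illustration of such an estimate, the sketch below checks whether a model's INT8 weights fit in the 288 GB of VRAM. The 20% overhead factor for KV cache and activations is an illustrative assumption, not a vendor figure, and real requirements vary with framework and batch size as noted above:

```python
# Rough VRAM-fit check under INT8 quantization (1 byte per weight).
def fits_in_vram(params: float, vram_gb: float,
                 bytes_per_param: float = 1.0,
                 overhead: float = 1.2) -> bool:
    # overhead=1.2 is an assumed allowance for KV cache / activations
    needed_gb = params * bytes_per_param * overhead / 1e9
    return needed_gb <= vram_gb

print(fits_in_vram(70e9, 288))   # True  (~84 GB needed)
print(fits_in_vram(405e9, 288))  # False (~486 GB needed)
```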
Added Apr 30, 2026
Last updated: Apr 30, 2026