NVIDIA

Nvidia Nemotron 3 Nano 30B A3B BF16

Name: Nvidia Nemotron 3 Nano 30B A3B BF16
Author: NVIDIA

NVIDIA develops Nvidia Nemotron 3 Nano 30B A3B BF16, a large language model designed to handle both reasoning and non-reasoning tasks. It excels at math-related tasks, as evidenced by a Math Index score in the top 25%. It uses a hybrid Mixture-of-Experts architecture with 23 MoE layers and 6 attention layers, which allows efficient processing of long contexts of up to 262,144 tokens.

Context: 262K
Max Output: 20K
Parameters: 31.6B

Technical Specifications

Model Type: Chat
Context Window: 262,144 tokens
Max Output Tokens: 20,480 tokens
Parameters: 31.6B
Release Date: Dec 4, 2025
Training Cutoff: Nov 28, 2025
License: other
Open Source: Yes
Input Modalities: Text
Output Modalities: Text
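The context window and output cap listed above can be turned into a simple prompt budget. The following is a minimal sketch, not part of any provider API: it assumes that prompt and generated tokens share the same 262,144-token window (the page does not state this), and the function name `max_prompt_tokens` is hypothetical.

```python
# Limits taken from the Technical Specifications listed above.
CONTEXT_WINDOW = 262_144      # total tokens the model can attend to
MAX_OUTPUT_TOKENS = 20_480    # cap on generated tokens per request


def max_prompt_tokens(requested_output: int) -> int:
    """Return the largest prompt that still leaves room for the output.

    Assumption: prompt and output tokens share one context window.
    """
    if not 0 < requested_output <= MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"requested_output must be in 1..{MAX_OUTPUT_TOKENS}, "
            f"got {requested_output}"
        )
    return CONTEXT_WINDOW - requested_output


if __name__ == "__main__":
    # Reserving the full output cap leaves this many tokens for the prompt.
    print(max_prompt_tokens(MAX_OUTPUT_TOKENS))
```

Under that assumption, reserving the full 20,480 output tokens leaves 241,664 tokens for the prompt.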

Capabilities

Benchmarks

Artificial Analysis

HLE: 10.2%
GPQA: 75.7%
SciCode: 29.6%
MMLU Pro: 79.4%
Math: 91.0
Coding: 19.0
LiveCodeBench: 74.1%
Intelligence: 17.5
Speed (tok/s): 51.8
TTFA (s): 42.9
TTFT (s): 4.2

Resources & Links

HuggingFace

Model card on HuggingFace

Estimated GPU Requirements for Nvidia Nemotron 3 Nano 30B A3B BF16

Estimates based on INT8 quantization. Actual requirements vary by framework and configuration.

NVIDIA A100 SXM 40GB: 40 GB VRAM, 90% used
NVIDIA L40: 48 GB VRAM, 75% used
NVIDIA L40S: 48 GB VRAM, 75% used
AMD Instinct MI210: 64 GB VRAM, 57% used
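The utilization figures above can be approximated with back-of-the-envelope arithmetic: weight memory is roughly the parameter count times the bytes per parameter, plus some headroom for activations and KV cache. The sketch below is an illustration only. The 15% overhead factor and the function names are assumptions, not the method the sizing calculator actually uses, so the results land near the listed percentages rather than matching them exactly.

```python
# Rough VRAM estimate: parameters x bytes per parameter x overhead.
# The overhead factor is an assumed value, not taken from the page.

def estimate_vram_gb(params_billion: float,
                     bytes_per_param: float = 1.0,   # INT8 = 1 byte
                     overhead: float = 1.15) -> float:
    """Estimated memory in GB for weights plus assumed runtime overhead."""
    return params_billion * bytes_per_param * overhead


def utilization(required_gb: float, vram_gb: float) -> float:
    """Fraction of a GPU's VRAM that the estimate would occupy."""
    return required_gb / vram_gb


if __name__ == "__main__":
    gpus = {
        "NVIDIA A100 SXM 40GB": 40,
        "NVIDIA L40": 48,
        "NVIDIA L40S": 48,
        "AMD Instinct MI210": 64,
    }
    need = estimate_vram_gb(31.6)  # 31.6B parameters at INT8
    for name, vram in gpus.items():
        print(f"{name}: {utilization(need, vram):.0%} of {vram} GB")
```

Note that the listed estimates assume INT8 weights even though this variant is named BF16. Passing `bytes_per_param=2.0` to the same sketch gives about 72.7 GB, which would not fit on any single GPU in the list.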


Data sourced from official provider APIs and documentation

Last updated: Jun 24, 2026
