Nemotron-Nano 9B v2
Nemotron-Nano 9B v2 is NVIDIA's unified reasoning/non-reasoning model. It can toggle between fast standard generation and extended chain-of-thought reasoning, making it versatile for both interactive chat and structured problem-solving. At …
9.0B
Parameters
128K
Max Context
Dense
Architecture
Nov 18, 2025
Released
Text
Modality
About Nemotron-Nano 9B v2
Nemotron-Nano 9B v2 is NVIDIA's unified reasoning/non-reasoning model. It can toggle between fast standard generation and extended chain-of-thought reasoning, making it versatile for both interactive chat and structured problem-solving. At 9B parameters with 128K context and competitive benchmark scores, it is a strong alternative to Qwen 3 8B. The NVIDIA Open Model License permits commercial use with some restrictions.
Technical Specifications
System Requirements
Estimated VRAM at 10% overhead for different quantization methods and context sizes.
| Quantization | 1K ctx | 128K ctx |
|---|---|---|
Q4_K_M0.50 B/W ~97% of FP16 | 4.81Consumer GPU | 24.65Datacenter GPU |
Q8_01.00 B/W ~100% of FP16 | 9.46Consumer GPU | 29.30Datacenter GPU |
F162.00 B/W Reference | 18.76Consumer GPU | 38.61Datacenter GPU |
Other Nvidia Models
View AllFind the right GPU for Nemotron-Nano 9B v2
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.