Qwen 2.5 32B
32.5B
Parameters
128K
Max Context
Dense
Architecture
Sep 19, 2024
Released
Text
Modality
About Qwen 2.5 32B
Qwen 2.5 32B is the sweet spot of the Qwen 2.5 family — delivering near-70B-class performance at roughly half the VRAM. At 32.5B parameters, Q4_K_M needs ~18 GB, fitting comfortably on 24 GB GPUs with room for long context. Strong across coding (especially with the Coder variant), math, and multilingual tasks. Apache 2.0 licensed. For 24 GB GPU owners who want maximum dense-model quality without MoE overhead, this is one of the best options.
Technical Specifications
System Requirements
Estimated VRAM requirements (including 10% overhead) for different quantization methods and context sizes.
| Quantization | Bytes/weight | Quality vs FP16 | 1K ctx (GB) | 128K ctx (GB) |
|---|---|---|---|---|
| Q4_K_M | 0.50 | ~97% | 17.05 (Consumer GPU) | 48.80 (Datacenter GPU) |
| Q8_0 | 1.00 | ~100% | 33.85 (Datacenter GPU) | 65.60 (Datacenter GPU) |
| F16 | 2.00 | Reference | 67.44 (Datacenter GPU) | 99.19 (Cluster / Multi-GPU) |
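The figures above can be approximated from first principles: weight memory is parameter count times bytes per weight, and the KV cache grows linearly with context length. Below is a minimal sketch using Qwen 2.5 32B's published architecture (64 layers, 8 KV heads via grouped-query attention, head dim 128) and an fp16 KV cache. It is an assumption-laden estimate, not the table's exact methodology, so its numbers will differ somewhat from the table (for example, the calculator may quantize the KV cache or account for activation buffers differently).

```python
# Rough VRAM estimator: weights + KV cache, plus a flat overhead factor.
# Architecture constants are Qwen 2.5 32B's published config; the
# bytes-per-weight figures match the table above. Sketch only.

PARAMS = 32.5e9          # total parameters
LAYERS = 64              # transformer layers
KV_HEADS = 8             # grouped-query attention KV heads
HEAD_DIM = 128
OVERHEAD = 1.10          # 10% overhead, as in the table

def vram_gb(bytes_per_weight: float, ctx_tokens: int,
            kv_bytes: float = 2.0) -> float:
    """Estimated VRAM in GB for a given quantization and context length."""
    weights = PARAMS * bytes_per_weight
    # K and V caches: 2 tensors x layers x kv_heads x head_dim per token
    kv_cache = 2 * LAYERS * KV_HEADS * HEAD_DIM * ctx_tokens * kv_bytes
    return (weights + kv_cache) * OVERHEAD / 1e9

for name, bpw in [("Q4_K_M", 0.50), ("Q8_0", 1.00), ("F16", 2.00)]:
    print(f"{name}: {vram_gb(bpw, 1024):.2f} GB @ 1K, "
          f"{vram_gb(bpw, 131072):.2f} GB @ 128K")
```

The short-context estimates land close to the table (weights dominate at 1K context), while long-context estimates are more sensitive to how the KV cache is stored.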
Find the right GPU for Qwen 2.5 32B
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.