Mistral Small 3.1 24B
- Parameters: 24.0B
- Max Context: 128K
- Architecture: Dense
- Released: Mar 17, 2025
- Modality: Text + Vision
About Mistral Small 3.1 24B
Mistral Small 3.1 24B is one of the best models you can run on a single 24 GB consumer GPU: a dense 24B-parameter model that fits entirely on an RTX 3090/4090 at Q4_K_M (~13 GB), with room left over for long context. It includes a vision encoder, function calling, and strong multilingual performance, and is Apache 2.0 licensed. This is the go-to recommendation for anyone with a 24 GB GPU who wants maximum quality without offloading.
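The ~13 GB figure can be sanity-checked with back-of-the-envelope arithmetic (a sketch, using the ~0.50 bytes/weight for Q4_K_M and the 10% overhead stated in the spec table; KV cache for context is extra):

```python
# Weight memory for a 24.0B-parameter model at Q4_K_M.
# Assumes 0.50 bytes/weight (per the spec table) and 10% runtime overhead.
params = 24.0e9
bytes_per_weight = 0.50

weights_gb = params * bytes_per_weight / 1e9   # 12.0 GB of raw weights
total_gb = weights_gb * 1.10                   # +10% overhead -> ~13.2 GB
print(round(weights_gb, 1), round(total_gb, 1))  # 12.0 13.2
```

This lines up with the "~13 GB" claim above; the context's KV cache comes on top of it.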
Technical Specifications
System Requirements
Estimated VRAM (GB) assuming 10% overhead, for different quantization methods and context sizes.
| Quantization | Bytes/Weight | Quality | 1K ctx | 128K ctx |
|---|---|---|---|---|
| Q4_K_M | 0.50 | ~97% of FP16 | 12.62 GB (Consumer GPU) | 40.41 GB (Datacenter GPU) |
| Q8_0 | 1.00 | ~100% of FP16 | 25.03 GB (Datacenter GPU) | 52.81 GB (Datacenter GPU) |
| F16 | 2.00 | Reference | 49.84 GB (Datacenter GPU) | 77.62 GB (Datacenter GPU) |
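As a rough sketch of where these numbers come from: total VRAM ≈ quantized weights + KV cache, plus 10% overhead. The layer count, KV-head count, and head dimension below are illustrative assumptions, not confirmed architecture details for this model, so the results only approximate the table:

```python
def estimate_vram_gb(params: float, bytes_per_weight: float, ctx: int,
                     n_layers: int = 40, n_kv_heads: int = 8,
                     head_dim: int = 128, kv_bytes: int = 2,
                     overhead: float = 0.10) -> float:
    """Rough VRAM estimate: quantized weights + FP16 KV cache + overhead.

    n_layers, n_kv_heads, and head_dim are assumed values for illustration,
    not verified specs of Mistral Small 3.1 24B.
    """
    weights = params * bytes_per_weight
    # KV cache: 2 tensors (K and V) per layer, per token, per KV head.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * ctx
    return (weights + kv_cache) * (1 + overhead) / 1e9

print(round(estimate_vram_gb(24.0e9, 0.50, 1024), 2))    # Q4_K_M, 1K context
print(round(estimate_vram_gb(24.0e9, 0.50, 131072), 2))  # Q4_K_M, 128K context
```

Note how at short context the weights dominate, while at 128K the KV cache adds tens of gigabytes, which is why only the Q4_K_M/1K cell fits a consumer GPU.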
Find the right GPU for Mistral Small 3.1 24B
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.