Mistral · MoE · Apache 2.0

Mixtral 8x22B (MoE)

Parameters: 141.0B
Active: 39.0B
Max Context: 64K
Architecture: MoE
Released: Apr 17, 2024
Modality: Text

About Mixtral 8x22B (MoE)

Mixtral 8x22B is the larger sibling of Mixtral 8x7B: 141B total parameters with 39B active per token. It delivers quality competitive with dense 70B models but requires significant VRAM (~75 GB at Q4_K_M). The Apache 2.0 license makes it attractive for enterprise deployments, though most users will need server-class hardware or aggressive quantization. Supports 64K context.
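
The ~75 GB figure follows from simple arithmetic: at Q4_K_M's roughly 0.5 bytes per weight, 141B parameters occupy about 141 × 0.5 ≈ 70.5 GB for the weights alone, with the KV cache and runtime overhead accounting for the rest (see the estimator sketch under System Requirements below).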

Enterprise · Code · Multilingual · Commercial

Technical Specifications

Total Parameters: 141.0B
Active Parameters: 39.0B per token
Architecture: Mixture of Experts
Total Experts: 8
Active Experts: 2 per token
Attention Type: GQA (Grouped Query Attention)
Hidden Dimension: d = 6,144
Transformer Layers: 56
Attention Heads: 48
KV Heads: n_kv = 8
Head Dimension: d_head = 128
Activation Function: SwiGLU
Normalization: RMSNorm
Position Embedding: RoPE
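
The mixture-of-experts layout above (8 experts, 2 active per token) is what separates the 39B active parameters from the 141B total: a small router picks two expert feed-forward networks per token, and only those two run. The sketch below, in PyTorch, shows a minimal top-2 routing layer with SwiGLU experts. It is an illustration of the general technique, not Mistral's implementation; the toy dimensions, class names, and the per-expert loop are assumptions for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUExpert(nn.Module):
    """One feed-forward expert using the SwiGLU activation listed above."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)  # gate projection
        self.w3 = nn.Linear(d_model, d_ff, bias=False)  # up projection
        self.w2 = nn.Linear(d_ff, d_model, bias=False)  # down projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

class Top2MoELayer(nn.Module):
    """Sparse MoE feed-forward block: each token is routed to 2 of 8 experts."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(SwiGLUExpert(d_model, d_ff) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])                # (batch*seq, d_model)
        logits = self.router(tokens)                       # (n_tokens, n_experts)
        weights, chosen = logits.topk(self.top_k, dim=-1)  # best 2 experts per token
        weights = weights.softmax(dim=-1)                  # renormalize over the chosen 2
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = chosen[:, k] == e                   # tokens whose k-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)

# Toy dimensions for a quick shape check; the real model uses d = 6,144 across 56 layers.
layer = Top2MoELayer(d_model=64, d_ff=128)
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```

Only 2 of the 8 expert MLPs run per token, which is why inference compute tracks the 39B active parameters while VRAM must still hold all 141B.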

System Requirements

Estimated VRAM, in GB, with 10% overhead, for different quantization methods and context sizes.

Quantization | Bytes per weight | Quality | 1K ctx | 64K ctx
Q4_K_M | 0.50 B/W | ~97% of FP16 | 73.10 GB (datacenter GPU) | 86.88 GB (cluster / multi-GPU)
Q8_0 | 1.00 B/W | ~100% of FP16 | 146.0 GB (cluster / multi-GPU) | 159.8 GB (cluster / multi-GPU)
F16 | 2.00 B/W | reference | 291.7 GB (cluster / multi-GPU) | 305.5 GB (cluster / multi-GPU)

Hardware tiers: fits 24 GB consumer GPU · fits 80 GB datacenter GPU · requires cluster / multi-GPU.
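
The table's figures can be approximated from first principles: quantized weights cost parameters × bytes-per-weight, and the GQA KV cache costs 2 × layers × kv_heads × head_dim bytes per token at a given cache precision. The Python sketch below shows the shape of that calculation using the spec-table values. The exact formula, KV-cache precision, and rounding conventions behind the table aren't published, so the defaults here (notably kv_bytes) are assumptions, and the output only roughly tracks the numbers above.

```python
def estimate_vram_gb(
    total_params_b: float = 141.0,   # total parameters, in billions
    bytes_per_weight: float = 0.50,  # 0.50 = Q4_K_M, 1.00 = Q8_0, 2.00 = F16
    n_layers: int = 56,              # transformer layers (spec table above)
    n_kv_heads: int = 8,             # GQA KV heads, not the 48 query heads
    head_dim: int = 128,
    kv_bytes: float = 2.0,           # bytes per cached K/V value (assumption: FP16 cache)
    context: int = 65_536,           # tokens of context to budget for (64K)
    overhead: float = 0.10,          # the table's 10% overhead
) -> float:
    """Back-of-the-envelope VRAM estimate: weights + KV cache + overhead."""
    weights = total_params_b * 1e9 * bytes_per_weight
    # GQA shrinks the cache: K and V are stored for 8 KV heads rather than
    # all 48 query heads, a 6x reduction per token.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * kv_bytes * context
    return (weights + kv_cache) * (1 + overhead) / 1e9

for name, bpw in [("Q4_K_M", 0.50), ("Q8_0", 1.00), ("F16", 2.00)]:
    print(f"{name}: ~{estimate_vram_gb(bytes_per_weight=bpw):.0f} GB at 64K context")
```

Whatever the exact constants, the conclusion matches the table: even the most aggressive quantization listed leaves the model beyond a single 80 GB datacenter GPU at full context, hence the cluster / multi-GPU tier.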

Find the right GPU for Mixtral 8x22B (MoE)

Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.