Mistral: Mixtral 8x22B (base)

Mixtral 8x22B is a large-scale sparse mixture-of-experts language model from Mistral AI. It consists of 8 experts with 22 billion parameters each, and each token is routed to 2 experts at a time.
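
The top-2 routing described above can be sketched in a few lines. The sketch below is illustrative only: the expert count and top-k match the description, but the layer sizes, router, and activation are made-up placeholders, not Mixtral's actual implementation.

```python
# Minimal sketch of top-2 mixture-of-experts routing.
# Dimensions (d_model, d_ff) and the softmax-over-top-k router are
# illustrative assumptions, not Mixtral's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.router(x)                         # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep the 2 best experts per token
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = Top2MoE()
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
```

Because only 2 of the 8 experts run per token, the number of active parameters per forward pass is much smaller than the model's total parameter count.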

It was released via a torrent magnet link posted on X (formerly Twitter).

#moe
