Mistral releases Large 3 model with 41 billion active parameters
French AI lab challenges US dominance with a high-efficiency Mixture-of-Experts architecture.
Efficient power for enterprise
Paris-based Mistral AI continued its challenge to US-based hyperscalers on Tuesday with the release of Mistral Large 3, a new flagship model built on a Mixture-of-Experts (MoE) architecture. The model features 675 billion total parameters but activates only 41 billion per token during inference. This "sparse activation" strategy allows the model to deliver reasoning capabilities comparable to GPT-4-class systems while significantly reducing the computational cost for enterprise users.
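To illustrate the idea, the sketch below shows a minimal Mixture-of-Experts layer in Python that routes each token through only a small number of experts. The sizes, the top-2 routing, and the expert structure are illustrative assumptions, not Mistral's published configuration; the point is simply that per-token compute scales with the active parameters, not the total parameter count.

```python
import numpy as np

def moe_layer(token, experts, router_weights, top_k=2):
    """Route one token through only `top_k` of the available experts.

    `experts` is a list of (W_in, W_out) weight pairs. Only the selected
    experts' weights are used, which is why per-token compute tracks the
    active parameters rather than the total parameters.
    """
    # The router scores every expert for this token, then keeps the top_k.
    scores = router_weights @ token                           # shape: (num_experts,)
    top = np.argsort(scores)[-top_k:]                         # indices of chosen experts
    gates = np.exp(scores[top]) / np.exp(scores[top]).sum()   # softmax over the chosen experts

    # Weighted sum of the chosen experts' outputs; all other experts stay idle.
    out = np.zeros_like(token)
    for gate, idx in zip(gates, top):
        w_in, w_out = experts[idx]
        out += gate * (w_out @ np.maximum(w_in @ token, 0.0))  # simple ReLU feed-forward expert
    return out

# Illustrative sizes only: 8 experts, 2 active per token.
d_model, d_ff, num_experts = 16, 64, 8
rng = np.random.default_rng(0)
experts = [(rng.standard_normal((d_ff, d_model)) * 0.1,
            rng.standard_normal((d_model, d_ff)) * 0.1) for _ in range(num_experts)]
router = rng.standard_normal((num_experts, d_model)) * 0.1
y = moe_layer(rng.standard_normal(d_model), experts, router)
```

In this toy layer, six of the eight experts contribute nothing to a given token, so roughly three quarters of the layer's weights are never touched for that token; Mistral Large 3 applies the same principle at far larger scale (41 billion of 675 billion parameters active per token).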
The release also included the Ministral 3 family, a suite of smaller dense models (3B, 8B, and 14B parameters) designed for edge computing and local inference. …