Article Custom Policy Enforcement with Reasoning: Faster, Safer AI Applications 4 days ago • 15
Ministral 3 - Additional Checkpoints Collection Different formats and Quantized versions of our Ministral 3 family; 14B/8B/3B Instruct/Reasoning GGUF, 3B Instruct ONNX and 14B/8B/3B Instruct BF16. • 13 items • Updated 4 days ago • 11
Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 2 days ago • 38
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 4 days ago • 69
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 4 days ago • 110
Article Transformers v5: Simple model definitions powering the AI ecosystem 6 days ago • 223
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning Paper • 2511.18659 • Published 13 days ago • 11
Article We’re open-sourcing our text-to-image model and the process behind it 24 days ago • 73
Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated 7 days ago • 26
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 5 days ago • 37
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 24 days ago • 68