Voxtral Mini 3B (MLX, 8-bit)

8-bit quantized MLX weights for Mistral's Voxtral Mini speech-to-text model, optimized for Apple Silicon inference. Recommended for most users — best balance of quality and download size.

Voxtral Mini is built on Ministral 3B with state-of-the-art audio understanding capabilities. It supports transcription, translation, Q&A, summarization, and function calling directly from audio input.

Features

  • 8 languages: English, Spanish, French, Portuguese, Hindi, German, Dutch, Italian
  • Long-form audio: up to 30 min transcription, 40 min understanding
  • 32k token context
  • Voice-triggered function calling

Specifications

  • Total parameters: 4.68B (8-bit quantized)
  • Precision: 8-bit
  • Download size: 5 GB
  • License: Apache 2.0
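The download size follows from the parameter count: 8-bit quantization stores roughly one byte per parameter, with quantization metadata accounting for the remainder. A quick back-of-envelope check:

```python
# Rough size estimate for 8-bit quantized weights.
params = 4.68e9            # total parameter count from the spec table
bytes_per_param = 1.0      # 8 bits = 1 byte per parameter
approx_gb = params * bytes_per_param / 1e9

# Quantization metadata (per-group scales and biases, index tensors)
# adds overhead on top of this, which is why the actual download
# is about 5 GB rather than exactly 4.68 GB.
print(f"~{approx_gb:.2f} GB of raw 8-bit weights")
```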

Usage

Default model for Mac X — on-device speech transcription on Apple Silicon via MLX.
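MLX speech runtimes vary, so the sketch below only covers fetching the weights with `huggingface_hub` (a real API); how you load them afterwards depends on the MLX tool you use, which is left as an assumption here. The repo id is taken from this card.

```python
def fetch_weights(repo_id: str = "Aayush9029/voxtral-mini-3b-8bit") -> str:
    """Download (or reuse the local cache of) the ~5 GB weight snapshot.

    Returns the local directory path, which an MLX-based runtime can
    then point at. The import is kept inside the function so the
    sketch has no hard dependency at import time.
    """
    from huggingface_hub import snapshot_download  # real huggingface_hub API
    return snapshot_download(repo_id)
```

Calling `fetch_weights()` returns the local snapshot path; note the first call downloads about 5 GB.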

See also: Full precision (bfloat16) (17.4 GB) | 4-bit mixed quantized (3.2 GB)

License

Apache 2.0 — original model by Mistral AI.
