Qwen2.5-0.5B-Instruct (MLX, 8-bit)
This repository contains an MLX-converted and 8-bit quantized version of Qwen/Qwen2.5-0.5B-Instruct.
- No fine-tuning or training was performed
- Format conversion + post-training quantization only
- 8-bit quantization prioritizes output stability and quality over the smaller 4-bit variant
Usage
pip install -U mlx-lm
mlx_lm.generate \
--model Irfanuruchi/Qwen2.5-0.5B-Instruct-MLX-8bit \
--prompt "Write a helpful onboarding message for an iOS app in 3 bullet points."
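The same generation can also be run from Python via mlx-lm's `load`/`generate` helpers. This is a minimal sketch, not part of the card: `max_tokens` and the lazy import are illustrative choices, and the model weights are downloaded on first use.

```python
# Minimal Python sketch using the mlx-lm API
# (assumes `pip install mlx-lm` on an Apple-silicon Mac).
MODEL_ID = "Irfanuruchi/Qwen2.5-0.5B-Instruct-MLX-8bit"

# Chat-style message list mirroring the CLI prompt above.
MESSAGES = [
    {
        "role": "user",
        "content": "Write a helpful onboarding message for an iOS app in 3 bullet points.",
    },
]

def run() -> str:
    # Imported lazily so this module still loads on machines without mlx installed.
    from mlx_lm import load, generate

    model, tokenizer = load(MODEL_ID)  # downloads the weights on first use
    prompt = tokenizer.apply_chat_template(MESSAGES, add_generation_prompt=True)
    return generate(model, tokenizer, prompt=prompt, max_tokens=100)

if __name__ == "__main__":
    print(run())
```

`apply_chat_template` wraps the message in Qwen's chat format, which matches what the `mlx_lm.generate` CLI does for instruct models.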
Bench notes (MacBook Pro, M3 Pro)
- Prompt tokens: 45
- Generation tokens: 100
- Generation speed: ~192.9 tokens/sec
- Peak memory: ~0.565 GB
Tooling
- mlx-lm: 0.30.2
- mlx: installed as a dependency of mlx-lm (version not pinned separately)
Related models
- 4-bit variant (recommended default):
https://huggingface.co/Irfanuruchi/Qwen2.5-0.5B-Instruct-MLX-4bit