MagicQuant GGUF Hybrids - Qwen3 30B A3B Instruct 2507

MagicQuant is an automated quantization, benchmarking, and evolutionary hybrid-GGUF search system for LLMs.

Each release includes models optimized to outperform the standard baseline quants (Q8_0, Q6_K, Q5_K, Q4_K_M). If a baseline GGUF exists in this repo, the evolutionary engine couldn't beat it. If a baseline is missing, a hybrid configuration outperformed it so completely that including the baseline would have been pointless.
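Under the hood, the search loop boils down to mutate, benchmark, select over per-tensor quant assignments. Here is a minimal sketch of that idea; the tensor groups, quant bit-widths, fitness weights, and the `build_and_bench` stub are all illustrative assumptions, not MagicQuant's actual code:

```python
import random

# Illustrative quant types with rough bits-per-weight, and hypothetical
# tensor groups. Real recipes assign a quant type per tensor group.
QUANTS = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K": 5.5, "IQ4_NL": 4.5}
GROUPS = ["embed", "attn_qkv", "attn_out", "ffn_up", "ffn_down", "output"]

def build_and_bench(recipe):
    # Placeholder: the real system quantizes the model with this per-group
    # recipe and measures file size, TPS, and PPL drift. Faked here with a
    # bits-per-weight proxy so the sketch runs standalone.
    bpw = sum(QUANTS[q] for q in recipe.values()) / len(recipe)
    return bpw * 3.6, 600 / bpw, 4.0 / bpw  # size_gb, tps, prec_loss%

def fitness(recipe):
    size_gb, tps, loss = build_and_bench(recipe)
    return loss + 0.05 * size_gb - 0.01 * tps  # smaller is better

def mutate(recipe):
    # Flip the quant type of one randomly chosen tensor group.
    child = dict(recipe)
    child[random.choice(GROUPS)] = random.choice(list(QUANTS))
    return child

def search(generations=50, pop=8):
    best = {g: "Q5_K" for g in GROUPS}  # seed from a uniform baseline
    for _ in range(generations):
        best = min([mutate(best) for _ in range(pop)] + [best], key=fitness)
    return best  # released only if it beats the plain baseline quants
```

In the real pipeline the benchmark step is the expensive part: every candidate is quantized and run through the PPL/TPS suite shown in the tables below.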

These hybrid GGUFs are built to be as small, fast, and low-drift as possible while preserving model capability.

To dive deeper into how MagicQuant works, see the main repo: MagicQuant on GitHub (by MagicCodingMan)

Notes:

  • The Hugging Face hardware-compatibility widget (the one that shows bit-widths) is usually wrong here: it doesn't understand hybrid mixes, so don't trust it.
  • Naming scheme can be found on the MagicQuant Wiki.
  • Tips: less precision loss means less brain damage, more TPS means faster generation, and smaller is only better if the precision holds up.

Precision Loss Guide

  • 0–0.1% → God-tier, scientifically exact
  • 0.1–1% → True near-lossless, agent-ready
  • 1–3% → Minimal loss, great for personal use
  • 3–5% → Borderline, but still functional
  • 5%+ → Toys, not tools, outside MagicQuant’s scope

Learn more about precision loss here.
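For reference, the loss columns in the tables below are consistent with relative perplexity drift from the BF16 baseline, loss = |PPL_quant − PPL_BF16| / PPL_BF16 × 100, with avg_prec_loss the mean across the three domains. A quick sanity check against the Q8_0 row (formula inferred from the published numbers, not pulled from MagicQuant's source):

```python
# PPL values copied from the tables below (BF16 baseline vs Q8_0).
bf16 = {"gen": 6.2581, "code": 1.2981, "math": 5.7092}
q8_0 = {"gen": 6.2536, "code": 1.2991, "math": 5.7045}

# Relative PPL drift vs BF16, in percent (formula inferred from the tables).
loss = {k: abs(q8_0[k] - bf16[k]) / bf16[k] * 100 for k in bf16}

for k, v in loss.items():
    print(k, round(v, 4))                # gen 0.0719, code 0.077, math 0.0823
print(round(sum(loss.values()) / 3, 4))  # 0.0771 -> matches avg_prec_loss
```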

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| iq4_nl-EHQKOUD-Q8_0 | 30.25 | 99.68 | 0.0771% |
| Q5_K | 20.23 | 117.37 | 0.2007% |
| mxfp4_moe-H-B16-EUR-IQ4NL-KO-Q5K-QD-Q6K | 18.93 | 110.54 | 0.3929% |
| IQ4_NL | 16.26 | 138.69 | 0.4198% |
| iq4_nl-EHQKOUD-IQ4NL | 16.04 | 149.76 | 2.6323% |

Table - PPL Columns

| Model | General PPL | ±Err | Code PPL | ±Err | Math PPL | ±Err |
|---|---|---|---|---|---|---|
| iq4_nl-EHQKOUD-Q8_0 | 6.2536 | 0.1277 | 1.2991 | 0.0072 | 5.7045 | 0.1063 |
| Q5_K | 6.2777 | 0.1283 | 1.3006 | 0.0073 | 5.7037 | 0.1062 |
| mxfp4_moe-H-B16-EUR-IQ4NL-KO-Q5K-QD-Q6K | 6.2854 | 0.1284 | 1.3036 | 0.0072 | 5.7274 | 0.1068 |
| IQ4_NL | 6.2669 | 0.1274 | 1.3111 | 0.0073 | 5.7159 | 0.1061 |
| iq4_nl-EHQKOUD-IQ4NL | 6.4836 | 0.1337 | 1.3170 | 0.0075 | 5.8712 | 0.1099 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| iq4_nl-EHQKOUD-Q8_0 | 0.0719 | 0.0770 | 0.0823 |
| Q5_K | 0.3132 | 0.1926 | 0.0963 |
| mxfp4_moe-H-B16-EUR-IQ4NL-KO-Q5K-QD-Q6K | 0.4362 | 0.4237 | 0.3188 |
| IQ4_NL | 0.1406 | 1.0015 | 0.1174 |
| iq4_nl-EHQKOUD-IQ4NL | 3.6033 | 1.4560 | 2.8375 |

Baseline Models (Reference)

Table - File Size + TPS + Avg Precision Loss

| Model | File Size (GB) | Bench TPS | Avg Precision Loss |
|---|---|---|---|
| BF16 | 56.90 | 44.48 | 0.0000% |
| Q8_0 | 30.25 | 95.03 | 0.0771% |
| Q5_K | 20.23 | 117.37 | 0.2007% |
| Q6_K | 23.37 | 108.10 | 0.3089% |
| IQ4_NL | 16.26 | 138.69 | 0.4198% |
| Q4_K_M | 17.28 | 132.46 | 1.4766% |
| MXFP4_MOE | 15.15 | 138.34 | 9.0818% |

Table - PPL Columns

| Model | General PPL | ±Err | Code PPL | ±Err | Math PPL | ±Err |
|---|---|---|---|---|---|---|
| BF16 | 6.2581 | 0.1279 | 1.2981 | 0.0072 | 5.7092 | 0.1064 |
| Q8_0 | 6.2536 | 0.1277 | 1.2991 | 0.0072 | 5.7045 | 0.1063 |
| Q5_K | 6.2777 | 0.1283 | 1.3006 | 0.0073 | 5.7037 | 0.1062 |
| Q6_K | 6.2881 | 0.1290 | 1.3002 | 0.0072 | 5.7255 | 0.1072 |
| IQ4_NL | 6.2669 | 0.1274 | 1.3111 | 0.0073 | 5.7159 | 0.1061 |
| Q4_K_M | 6.4032 | 0.1315 | 1.3145 | 0.0074 | 5.7576 | 0.1073 |
| MXFP4_MOE | 7.0161 | 0.1472 | 1.3631 | 0.0083 | 6.2873 | 0.1213 |

Table - Precision Loss Columns

| Model | General Loss (%) | Code Loss (%) | Math Loss (%) |
|---|---|---|---|
| BF16 | 0.0000 | 0.0000 | 0.0000 |
| Q8_0 | 0.0719 | 0.0770 | 0.0823 |
| Q5_K | 0.3132 | 0.1926 | 0.0963 |
| Q6_K | 0.4794 | 0.1618 | 0.2855 |
| IQ4_NL | 0.1406 | 1.0015 | 0.1174 |
| Q4_K_M | 2.3186 | 1.2634 | 0.8478 |
| MXFP4_MOE | 12.1123 | 5.0073 | 10.1258 |

Support

I'm a solo developer working full-time on my own projects, chasing a dream and pouring nights and weekends into open protocols and tools that I hope make the world a little better. If you chip in, you're helping me keep the lights on while I keep shipping.

Click here to see ways to support: BTC, PayPal, GitHub Sponsors.

Or, just drop a like on the repo :)
