HIGGS-per-tensor - an inference-optimization Collection

inference-optimization's Collections

HIGGS-per-tensor

Granite 4 Small and Tiny Quantized Models

NVIDIA-Nemotron-3-Nano-30B-A3B Quantized Models

Qwen3-Next-80B-A3B Quantized Models

KV Cache Quantization

HIGGS-per-tensor

updated about 17 hours ago