Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation
Paper 2601.22813
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization