VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse
Paper • 2512.14531 • Published • 12
Artificial Intelligence