Error related to n_moe_layer_step hyper parameter when using this model

#1
by kamathln - opened

I have been playing with several models (at least 20 extremely varied models so far) and have never come across one before that resulted in an error other than running out of memory.

```
GGML_ASSERT(hparams.n_moe_layer_step > 0 && "Llama 4 requires n_moe_layer_step > 0") failed
```

Using the Vulkan driver, if it matters.

Command line:

```shell
llama-server --host 0.0.0.0 -hf DarqueDante/MobileLLM-R1-140M-Q4_0-GGUF:Q4_0
```

Because the original model is "Llama4"-based, llama.cpp seems to expect it to be a MoE model, when in fact it is a dense model.

Oh interesting! Thanks a lot for clarifying.

kamathln changed discussion status to closed
