Fine-tuning roadmap

#18
by RonanMcGovern - opened

What fine-tuning library is likely to first be able to support deepseek v3?

Transformers did not have v2 integrated.

Supporting the MoE layers might take work, plus the latent attention (MLA) and multi-token prediction (MTP) heads. Then there's also supporting fp8 as the base model on which to train LoRAs…
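For what it's worth, the LoRA-on-a-frozen-base idea itself is library-agnostic: the base weights (fp8 or otherwise) stay frozen, and only small low-rank matrices are trained. A minimal PyTorch sketch of that pattern (the class name, rank, and alpha here are arbitrary illustrations, not DeepSeek's or PEFT's actual implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank (LoRA) update."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen (could be fp8/quantized)
        # Low-rank factors: B @ A has shape (out_features, in_features)
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero-init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus scaled low-rank correction: W x + scale * (B A) x
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

layer = LoRALinear(nn.Linear(64, 64))
out = layer(torch.randn(2, 64))
# Only the LoRA factors are trainable; the base layer is untouched.
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
```

The open question in this thread is less the LoRA math and more wiring this up through the MoE experts, MLA, and fp8 kernels in a real training stack.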

Thanks, and thanks for the model.

I have the same question: how do you fine-tune DeepSeek-V3? Could a guide be provided?

+1 for finetuning script


Hello, I am running DeepSeek V3 0324 on Nanogpt and wanted to know if there is a fine-tuning setup on Hugging Face I can use for DeepSeek myself. In the demo it acts basically the same as the original DeepSeek model, while on Nanogpt it's a bit different, despite having no censorship or anything, just the raw model. Thank you.
