LLaMA-3.1-8B SFT (Prompt Masking)

meta-llama/Llama-3.1-8B fine-tuned with supervised instruction tuning (SFT) using prompt masking, so the loss is computed only on response tokens and not on prompt tokens.
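A minimal sketch of the masking step, assuming the standard Hugging Face tokenizer API; the exact prompt/response formatting used during training is not specified in this card, so the formatting below is illustrative only:

```python
# Prompt masking: prompt positions get label -100, which cross-entropy
# (and the transformers loss) ignores, so only response tokens contribute.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

def build_masked_example(prompt: str, response: str) -> dict:
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    response_ids = response_ids + [tokenizer.eos_token_id]

    input_ids = prompt_ids + response_ids
    # -100 on prompt positions -> ignored by the loss; response ids are kept as labels
    labels = [-100] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}
```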

Training Details

  • Base Model: meta-llama/Llama-3.1-8B
  • Dataset: UltraChat-200K + SafetyLlama (~200K examples)
  • Training: 1 epoch (6326 steps)
  • Prompt Masking: Enabled (loss on response tokens only; a training-step sketch follows this list)
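
A hedged sketch of how masked batches typically feed into a training step; the optimizer, learning rate, and batching below are assumptions, not values reported in this card:

```python
# Hypothetical training step on prompt-masked batches.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B", torch_dtype=torch.bfloat16, device_map="auto"
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed learning rate

def training_step(batch: dict) -> float:
    # batch["labels"] already has -100 on prompt tokens, so the model's
    # built-in cross-entropy only counts response positions.
    outputs = model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        labels=batch["labels"],
    )
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```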

Evaluation Results

Benchmark      Baseline   This Model
GSM8K          16.4%      32.7%
MMLU           58.1%      58.2%
SST Safety     62.0%      77.0%
AlpacaEval     1.57%      4.5%
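
For reference, a minimal inference sketch, assuming the checkpoint loads with the standard transformers API; the repository id is taken from this page, and the prompt and generation settings are illustrative:

```python
# Load the fine-tuned checkpoint and generate a response.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "garg-aayush/llama31-8b-sft-mask"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain what prompt masking does during SFT."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```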

Files

  • eval_baseline/: Baseline evaluation results for Llama-3.1-8B before fine-tuning

Reference

Part of CS336 Assignment 5 (SFT Instruction Tuning). See building-from-scratch/sft for details.
