
Tulu3 with distraction mitigation data

updated Oct 30

LLMs and LRMs (large reasoning models) can be easily distracted by hidden instructions or irrelevant tasks embedded in their inputs. We curated SFT and DPO data on which models can be fine-tuned to resist such distraction; a minimal loading sketch follows the list below.


  • groupfairnessllm/tulu-3-preference-data-with-distraction

    Viewer • Updated Oct 27 • 1.5k • 45

  • groupfairnessllm/tulu-3-sft-with-distraction

    Viewer • Updated Oct 27 • 5.1k • 49 • 2

  • Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

    Paper • 2510.16259 • Published Oct 17 • 3

  • allenai/tulu-3-sft-personas-instruction-following

    Viewer • Updated Nov 21, 2024 • 30k • 2.12k • 57

  • allenai/llama-3.1-tulu-3-8b-preference-mixture

    Viewer • Updated Feb 4 • 273k • 1.66k • 24
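
As a starting point, here is a minimal sketch for pulling the two groupfairnessllm datasets above with the `datasets` library. The split name "train" is an assumption, and the column layout is not documented on this page; check the dataset viewers linked above for the actual schema before wiring either set into an SFT or DPO trainer.

```python
# Minimal loading sketch for the distraction-mitigation datasets in this
# collection. Dataset IDs come from the list above; the "train" split and
# the column layout are assumptions, not confirmed by the collection page.
from datasets import load_dataset

# SFT data: instruction-following examples curated to resist injected distractors.
sft_data = load_dataset(
    "groupfairnessllm/tulu-3-sft-with-distraction", split="train"
)

# DPO preference data: chosen/rejected pairs for preference tuning.
dpo_data = load_dataset(
    "groupfairnessllm/tulu-3-preference-data-with-distraction", split="train"
)

# Inspect the schema first, since the exact column names are not listed here.
print(sft_data.column_names)
print(dpo_data.column_names)
```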