
Tulu3 with distraction mitigation data

updated Oct 30

LLMs and LRMs (large reasoning models) can be easily distracted by hidden instructions or irrelevant tasks embedded in their inputs. We curated SFT and DPO data on which models can be fine-tuned to resist such distraction; a minimal loading sketch follows the list below.


  • groupfairnessllm/tulu-3-preference-data-with-distraction

    Viewer • Updated Oct 27 • 1.5k • 45

  • groupfairnessllm/tulu-3-sft-with-distraction

    Viewer • Updated Oct 27 • 5.1k • 49 • 2

  • Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

    Paper • 2510.16259 • Published Oct 17 • 3

  • allenai/tulu-3-sft-personas-instruction-following

    Viewer • Updated Nov 21, 2024 • 30k • 2.12k • 57

  • allenai/llama-3.1-tulu-3-8b-preference-mixture

    Viewer • Updated Feb 4 • 273k • 1.66k • 24
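
As a starting point, here is a minimal sketch for pulling the two groupfairnessllm datasets above with the `datasets` library. The split name "train" is an assumption, and the column layout is not documented on this page; check the dataset viewers linked above for the actual schema before wiring either set into an SFT or DPO trainer.

```python
# Minimal loading sketch for the distraction-mitigation datasets in this
# collection. Dataset IDs come from the list above; the "train" split and
# the column layout are assumptions, not confirmed by the collection page.
from datasets import load_dataset

# SFT data: instruction-following examples curated to resist injected distractors.
sft_data = load_dataset(
    "groupfairnessllm/tulu-3-sft-with-distraction", split="train"
)

# DPO preference data: chosen/rejected pairs for preference tuning.
dpo_data = load_dataset(
    "groupfairnessllm/tulu-3-preference-data-with-distraction", split="train"
)

# Inspect the schema first, since the exact column names are not listed here.
print(sft_data.column_names)
print(dpo_data.column_names)
```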