HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent Paper • 2402.01018 • Published Feb 1, 2024 • 2
Tulu3 with distraction mitigation data Collection LLMs and LRMs can be easily distracted by hidden instructions or irrelevant tasks. We curated SFT and DPO data on which models can be fine-tuned to avoid distraction. • 5 items • Updated Oct 30 • 2
groupfairnessllm/tulu-3-sft-personas-instruction-following-with-distraction Viewer • Updated Oct 21 • 1.7k • 36
groupfairnessllm/tulu-3-preference-personas-instruction-following-with-distraction Viewer • Updated Oct 21 • 500 • 24
Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense Paper • 2510.16259 • Published Oct 17 • 3