arxiv:2402.17139
Sherry Yang
sherryy
Models 10
sherryy/Qwen2-0.5B-GRPO-test
Updated
sherryy/best5-next10-nopizza-nonomad_sft_90
Text Generation • 8B • Updated
sherryy/pizza_rwr_2k-1k
Text Generation • 8B • Updated • 1
sherryy/pizza_rwr_k10_iter1
Text Generation • 8B • Updated • 2
sherryy/pizza_rwr_iter1
Text Generation • 8B • Updated • 1
sherryy/pizza_rwr_k10
Text Generation • 8B • Updated • 1
sherryy/pizza_rwr
Text Generation • 8B • Updated • 1
sherryy/pizza_sft_90
Text Generation • 8B • Updated • 1
sherryy/pizza_sft
Text Generation • 8B • Updated
sherryy/math-baseline
Text Generation • 8B • Updated • 1
Datasets 14
sherryy/best5-next10-nopizza-nonomad_sft_90
Viewer • Updated • 78.6k • 15
sherryy/pizza_rwr_k10_iter1
Viewer • Updated • 24.4k • 18
sherryy/pizza_rwr_iter1
Viewer • Updated • 42.4k • 7
sherryy/pizza_rwr
Viewer • Updated • 83k • 11
sherryy/tree_dataset
Viewer • Updated • 11.1k • 9
sherryy/pizza_sft
Viewer • Updated • 37.8k • 12
sherryy/pizza_dpo
Viewer • Updated • 5.61k • 4
sherryy/math12k
Viewer • Updated • 12.5k • 48
sherryy/random-acts-of-pizza
Viewer • Updated • 59.5k • 174
sherryy/test_data
Updated • 6