RyanYr/pg_sais-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_kl_bl0_matheval Updated 27 days ago • 148
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Updated 27 days ago • 360
RyanYr/pg-dapo_shuffled-0_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Updated 27 days ago • 149
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_behavior_matheval Updated 27 days ago • 170
RyanYr/pg_trajis-dapo_shuffled-offline-grpo_qwen2.5-math-1.5B_piref_matheval Updated 27 days ago • 150
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 27 days ago • 162
RyanYr/pg-dapo_shuffled-0_offline-grpo_qwen2.5-math-1.5B_piref_kl_behavior_matheval Updated 27 days ago • 366
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_nokl_matheval Updated 28 days ago • 1.55k • 41
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_kl_matheval Updated 28 days ago • 1.55k • 40
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_piref_kl_behavior_matheval Updated 28 days ago • 1.55k • 22
RyanYr/pg-dapo_shuffled-10_offline-grpo_qwen2.5-math-1.5B_kl_behavior_matheval Updated 29 days ago • 491
RyanYr/pg-dapo_shuffled-01_offline-grpo_qwen2.5-math-1.5B_kl_behavior_matheval Updated 29 days ago • 896
RyanYr/pg-dapo_shuffled-0_offline-grpo_qwen2.5-math-1.5B_kl_behavior_matheval Updated 29 days ago • 1.21k
RyanYr/pg-dapo_shuffled-01_offline-pg-dapo-qwen3-4B-Base-mbs128-n4_kl_matheval Updated 30 days ago • 18.6k • 484