int-prediction/lirpg-lora-intrinsic-fullparam-qwen2-5-math-7b-4000-40k-spk-a01-rin05-rex10-lrin5e-6-rank16-hand Updated 17 days ago
int-prediction/lirpg-fullparam-qwen2-5-math-7b-handrolled-zeroinit-token-grpo Updated about 1 month ago