Humanlearning
Add SFT and GRPO run commands to README
543a845 verified