I like how the explanation for GRPO is just a giant formula with no explanation of what it does
Ivan Nikishev
dpe1
AI & ML interests
he he he
Recent Activity
commented on an article about 8 hours ago
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge
new activity about 18 hours ago
HuggingFaceTB/nanowhale-100m: Nice
new activity 13 days ago
arnir0/Tiny-LLM: tiny-llm