P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 1.16k
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated 2 days ago • 165
Article Kimi-VL-A3B-Thinking-2506: A Quick Navigation Jun 21, 2025 • 75
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 273
AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale Paper • 2505.08311 • Published May 13, 2025 • 19
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset Paper • 2504.16891 • Published Apr 23, 2025 • 25
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published Jan 22, 2025 • 126
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 45
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 86
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator Paper • 2412.12094 • Published Dec 16, 2024 • 11
Qwen2-Math Collection Math-specific model series based on Qwen2 • 8 items • Updated 29 days ago • 52
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10, 2025 • 79
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published Jul 18, 2024 • 56
Article RegMix: Data Mixture as Regression for Language Model Pre-training Jul 11, 2024 • 15