Longhui98 (Longhui Yu)

upvoted a paper 2 months ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

upvoted a paper 6 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

upvoted an article 7 months ago

Introducing smolagents: simple agents that write actions in code.

Article • Dec 31, 2024 • 1.16k

upvoted a collection 7 months ago

Kimi-K2

Collection

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated 2 days ago • 165

upvoted an article 7 months ago

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

Article • Jun 21, 2025 • 75

upvoted a paper 7 months ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 273

upvoted 3 papers 9 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 324

AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

Paper • 2505.08311 • Published May 13, 2025 • 19

AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

Paper • 2504.16891 • Published Apr 23, 2025 • 25

upvoted a paper 10 months ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10, 2025 • 134

upvoted 5 papers about 1 year ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 126

O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

Paper • 2411.16489 • Published Nov 25, 2024 • 45

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 86

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published Dec 16, 2024 • 11

upvoted 2 collections over 1 year ago

Qwen2-Math

Collection

Math-specific model series based on Qwen2 • 8 items • Updated 29 days ago • 52

NuminaMath

Collection

Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated Feb 10, 2025 • 79

upvoted a paper over 1 year ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18, 2024 • 56

upvoted 2 articles over 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Article • Jul 11, 2024 • 15

How NuminaMath Won the 1st AIMO Progress Prize

Article • Jul 11, 2024 • 125
