SeanWang0027 PRO

SeanWang0027

https://haojinw0027.github.io/

AI & ML interests

LLM Post-Training

Recent Activity

published a model 2 days ago

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask_k4096

updated a model 2 days ago

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask_k4096

published a model 3 days ago

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask

Organizations

Collections 3

Papers 2

arxiv:2602.01058

arxiv:2505.16964

models 49

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask_k4096

Updated 2 days ago

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask

Updated 3 days ago

SeanWang0027/rl_warm_up_mixed_minesweeper_correct-parquet_qwen3-1.7b_epoch_3_mask

2B • Updated 10 days ago • 17

datasets 32

SeanWang0027/math_pope_mix_1018

Viewer • Updated 4 days ago • 1.02k • 29

SeanWang0027/sft_full_math_hard_9000

Viewer • Updated 4 days ago • 9k • 30

SeanWang0027/extreme_hard_4B

Viewer • Updated 6 days ago • 5.27k • 25

SeanWang0027/rlve_mixed_20envs_stitch_full

Viewer • Updated 24 days ago • 16k • 42

SeanWang0027/verl_mask_training

Updated 25 days ago • 94

SeanWang0027/rlve_30b_qwen_1.7b_mixed_20envs_10

Viewer • Updated 26 days ago • 16k • 25

SeanWang0027/teacher_prefix_sudoku_10K_sequential_qwen3_4b_thinking_continual_nemotron-cascade-8b

Updated 28 days ago • 62

SeanWang0027/student_prefix_sequential

Viewer • Updated about 1 month ago • 3k • 80 • 1

SeanWang0027/RAGEN

Updated Apr 11 • 494

SeanWang0027/mixed_sdft_solution_sequential_minesweeper_kukurasu_qwen3_4b_thinking

Updated Apr 9 • 43

Collections 3

SeanWang0027/olmo-7b-synlogic-sudoku-easy-grpo

SeanWang0027/olmo-7b-synlogic-sudoku-easy-hard-grpo

SeanWang0027/olmo-7b-synlogic-survo-sft

SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-sft

SeanWang0027/olmo-7b-synlogic-survo-space_reasoning-math_path-sft

SeanWang0027/sci-10k-olmo-7b-synlogic-survo-space_reasoning-math_path-sft

models 49

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask_k4096

SeanWang0027/rl_warm_up_mixed_minesweeper_correct_thinking-parquet_qwen3-1.7b_epoch_3_mask

SeanWang0027/grpo_minesweeper

SeanWang0027/rl_warm_up_stitch

SeanWang0027/rl_warm_up_mixed

SeanWang0027/grpo_kukurasu

SeanWang0027/sdft_experiment

SeanWang0027/rl_warm_up_stitch_survo-parquet_qwen3-1.7b_epoch_3_mask

SeanWang0027/rl_warm_up_mixed_survo_correct-parquet_qwen3-1.7b_epoch_3_mask

SeanWang0027/rl_warm_up_mixed_minesweeper_correct-parquet_qwen3-1.7b_epoch_3_mask
