The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published Oct 9, 2025 • 41
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning Paper • 2509.22621 • Published Sep 26, 2025 • 8
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Published Feb 11, 2025 • 106
Jointly Reinforcing Diversity and Quality in Language Model Generations Paper • 2509.02534 • Published Sep 2, 2025 • 25
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback Paper • 2506.11930 • Published Jun 13, 2025 • 53
Optimizing Decomposition for Optimal Claim Verification Paper • 2503.15354 • Published Mar 19, 2025 • 18