arxiv:2504.16828
Muhammad Khalifa
mkhalifa
AI & ML interests
natural language generation, reinforcement learning
Recent Activity
upvoted a paper about 1 month ago
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
liked a dataset about 1 month ago
nvidia/Nemotron-Personas-Korea
updated a dataset about 1 month ago
launch/thinkprm-1K-verification-cots