Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up
tuandunghcmut 's Collections
RL-Papers
MT-LLM
Visual Chain-of-Thought Reasoning Benchmarks
LLM for Security Benchmarks/Datasets
Visual-CoT/GCoT related
Text Embedding Papers
Quantized versions of LLMs/MLLMs
Multilingual Sentiment Analysis Dataset
LLM Series
LLM/MLLM (20B - 80B, fit on 1-2 A100/H100)
SLM
MLLM (100B - 300B)
Benchmarks for evaluating LLMs/MLLMs
Conversation Dataset
Multilingual Parallel Text Corpus
Multilingual Pretraining Corpus for Southeast Asian Language

RL-Papers

updated 6 days ago
Upvote
1

  • A Survey of Reinforcement Learning for Large Reasoning Models

    Paper • 2509.08827 • Published Sep 10 • 189

  • Agent Lightning: Train ANY AI Agents with Reinforcement Learning

    Paper • 2508.03680 • Published Aug 5 • 121

  • Aligning Multimodal LLM with Human Preference: A Survey

    Paper • 2503.14504 • Published Mar 18 • 26
Upvote
1
  • Collection guide
  • Browse collections
Company
TOS Privacy About Jobs
Website
Models Datasets Spaces Pricing Docs