
Original fine-tuned model: https://huggingface.co/yarin-shaked/Qwen3-Codeforces-GRPO

# Model Card for Hugston-Qwen3-Codeforces-GRPO

This model is a fine-tuned version of Qwen/Qwen3-0.6B on the open-r1/codeforces dataset. It was trained using TRL.

This model was converted to GGUF and quantized by the Hugston Team.

You can use the model with HugstonOne Enterprise Edition.


Tested.

Watch HugstonOne coding and preview in action: https://vimeo.com/1121493834?share=copy&fl=sv&fe=ci

## Usage

- Download the HugstonOne app from Hugston.com or https://github.com/Mainframework
- Download the model from https://hugston.com/explore?folder=llm_models or from Hugging Face
- If you already have the LLM model downloaded, choose it by clicking "Pick model" in HugstonOne
- Then click "Load model" in CLI or Server mode

- For multimodal use, you need a VL/multimodal LLM model with its mmproj file in the same folder.
- Select the model, then select the mmproj file.

- Note: if the mmproj file sits in the same folder as non-multimodal models, those models will not load until the mmproj file is moved out of the folder.
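The folder layout described above (one multimodal model plus its mmproj file in the same directory) can be sketched as a small helper. `find_mmproj` is a hypothetical illustration, not part of HugstonOne:

```python
from pathlib import Path


def find_mmproj(model_dir):
    """Return the single mmproj GGUF file in model_dir, or None.

    Hypothetical helper illustrating the layout a multimodal loader
    expects: the model GGUF and exactly one mmproj GGUF side by side.
    """
    candidates = [p for p in Path(model_dir).glob("*.gguf")
                  if "mmproj" in p.name.lower()]
    # Ambiguous (or missing) mmproj files mean the pairing is unclear,
    # which is why mixing mmproj files with unrelated models causes trouble.
    return candidates[0] if len(candidates) == 1 else None
```

With zero or several mmproj files in the folder the helper returns `None`, mirroring the note above about moving the mmproj out of shared folders.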


## Training procedure

This model was trained with GRPO, a method introduced in DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models.
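GRPO replaces the learned value baseline of PPO with group-relative reward normalization: for each prompt, several completions are sampled and each reward is standardized against its group's mean and standard deviation. A minimal sketch of that advantage computation, following the DeepSeekMath formulation:

```python
def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in GRPO.

    Each completion's advantage is its reward standardized within the
    group of completions sampled for the same prompt, so no separate
    value network is needed.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    # eps guards against a zero std when all rewards in the group tie.
    return [(r - mean) / (std + eps) for r in rewards]
```

For a group of pass/fail rewards like `[1.0, 0.0, 1.0, 0.0]`, passing completions get advantage about +1 and failing ones about -1, so the policy update pushes toward the within-group winners.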

## Framework versions

- TRL: 0.18.0
- Transformers: 4.52.3
- PyTorch: 2.6.0
- Datasets: 4.2.0
- Tokenizers: 0.21.4

## Citations

Cite GRPO as:

```bibtex
@article{zhihong2024deepseekmath,
    title   = {{DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models}},
    author  = {Zhihong Shao and Peiyi Wang and Qihao Zhu and Runxin Xu and Junxiao Song and Mingchuan Zhang and Y. K. Li and Y. Wu and Daya Guo},
    year    = 2024,
    eprint  = {arXiv:2402.03300},
}
```

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```

## Model details

- Downloads last month: 104
- Format: GGUF
- Model size: 0.6B params
- Architecture: qwen3
- Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
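The quantization levels above trade file size for fidelity. A rough size estimate is simply parameters times bits per weight; the sketch below ignores GGUF per-block scales and metadata, so real files run slightly larger:

```python
def gguf_size_gb(n_params, bits_per_weight):
    """Rough GGUF file-size estimate in GB: params * bits / 8.

    A back-of-the-envelope figure only; actual GGUF files also store
    per-block quantization scales and metadata.
    """
    return n_params * bits_per_weight / 8 / 1e9


# Approximate sizes for a 0.6B-parameter model at each offered level.
for bits in (4, 5, 6, 8, 16):
    print(f"{bits}-bit: ~{gguf_size_gb(0.6e9, bits):.2f} GB")
```

At 4-bit the 0.6B model weighs in around 0.3 GB, versus roughly 1.2 GB at 16-bit, which is why the lower-bit variants are the usual choice for laptop-class hardware.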
