august66
3 followers · 2 following
AI & ML interests: None yet
Recent Activity
updated a dataset about 2 hours ago: august66/drpo_hh_qwen2.5_1.5b_with_ref_prob_sampled
published a dataset about 15 hours ago: august66/drpo_hh_qwen2.5_1.5b_with_ref_prob_sampled
updated a model 2 days ago: august66/hh_qwen1.5_drpo
Organizations
august66's datasets (27)
august66/drpo_hh_qwen2.5_1.5b_with_ref_prob_sampled • Updated about 2 hours ago • 6
august66/drpo_hh_qwen2.5_1.5b_with_ref_btpref • Viewer • Updated Oct 8, 2025 • 48.8k • 191
august66/hh_qwen2.5_1.5b_with_bias_bt_pref • Viewer • Updated Oct 2, 2025 • 18k • 2
august66/hh_qwen2.5_1.5b_with_bias • Viewer • Updated Sep 27, 2025 • 18k • 27
august66/drpo_hh_qwen2.5_1.5b • Viewer • Updated Sep 8, 2025 • 43.8k • 3
august66/dpo_reward_dist_pi_theta_prompt_3 • Viewer • Updated Sep 3, 2025 • 5k • 1
august66/dpo_reward_dist_pi_theta_prompt_2 • Viewer • Updated Sep 3, 2025 • 5k • 1
august66/dpo_reward_dist_pi_theta • Viewer • Updated Aug 23, 2025 • 5k • 1
august66/reward_distribution_2_tldr_openassist_pi_ref • Viewer • Updated Aug 4, 2025 • 5k • 5
august66/reward_distribution_2_tldr_openassist_pi_theta • Viewer • Updated Aug 4, 2025 • 5k • 13
august66/reward_distribution_tldr_openassist_pi_theta • Viewer • Updated Jul 30, 2025 • 5k • 14
august66/reward_distribution_tldr_openassist_pi_ref • Viewer • Updated Jul 30, 2025 • 5k • 16
august66/drpo_ultrafeedback_qwen2.5-1.5b_first_iter_20k • Viewer • Updated Jul 8, 2025 • 20k
august66/drpo_ultrafeedback_qwen2.5-1.5b-7 • Viewer • Updated Jul 8, 2025 • 2.5k • 2
august66/drpo_ultrafeedback_qwen2.5-1.5b-6 • Viewer • Updated Jul 8, 2025 • 2.5k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b-5 • Viewer • Updated Jul 8, 2025 • 1.5k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b-4 • Viewer • Updated Jul 8, 2025 • 1k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b-3 • Viewer • Updated Jul 8, 2025 • 2.5k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b-2 • Viewer • Updated Jul 7, 2025 • 5k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b-1 • Viewer • Updated Jul 7, 2025 • 5k • 1
august66/drpo_ultrafeedback_qwen2.5-1.5b • Viewer • Updated Jul 7, 2025 • 30 • 1
august66/DRPO_data_from_ultrafeed_new_template • Viewer • Updated Jun 29, 2025 • 64k • 1
august66/DRPO_data_from_ultrafeed • Viewer • Updated Jun 24, 2025 • 64k • 1
august66/DRPO_first_iter_completion_label_test • Viewer • Updated Jun 16, 2025 • 200 • 1
august66/DRPO_first_iter • Viewer • Updated Jun 16, 2025 • 20k • 1
august66/dpo_train_data • Viewer • Updated Jun 3, 2025 • 25k • 1
august66/reward_data_for_dpo_train • Viewer • Updated Jun 2, 2025 • 25k • 1