Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning • Paper • 2512.24265 • Published Dec 30, 2025 • 4
UltraData • Collection • Ultra Scale, Ultra Quality, Ultra Coverage • 9 items • Updated 6 days ago • 73
nvidia/Nemotron-Pretraining-Specialized-v1 • Viewer • Updated Dec 22, 2025 • 60.7M • 4.43k • 69
Does your data spark joy? Performance gains from domain upsampling at the end of training • Paper • 2406.03476 • Published Jun 5, 2024 • 4
agentica-org/DeepScaleR-1.5B-Preview • Text Generation • 2B • Updated Apr 9, 2025 • 23.9k • 578