Recent Activity
textcleanlm/essentialweb-1.0-10B-clean-content
• Updated • 9.32M • 41
textcleanlm/essentialweb-1.0-10B-raw-content
• Updated • 9.32M • 51
textcleanlm/essentialweb-1.0-sample-10B
• Updated • 9.32M • 94
• Updated • 2.98M • 16
textcleanlm/med-domain-5b
• Updated • 4.07M • 19
textcleanlm/med-domain-data-sample1
• Updated • 814k • 8
textcleanlm/med-domain-data-sample
• Updated • 8.1k • 9
textcleanlm/fineweb-sample-10BT
• Updated • 14.9M • 38
textcleanlm/training-data-2
• Updated • 66.3k • 36
textcleanlm/textclean-10B
• Updated • 9.77M • 184
textcleanlm/textclean-2B-raw-cleaned
• Updated • 1.95M • 18
textcleanlm/textclean-2B-raw-sample
• Updated • 100 • 6
textcleanlm/textclean-2B-raw
• Updated • 1.97M • 8
textcleanlm/textclean-sft
• Updated • 894k • 5
• Updated • 91.7k • 5
textcleanlm/textclean-200M
• Updated • 581k • 6
textcleanlm/100M-raw-webtext-to-denoised-text
• Updated • 179k • 103
textcleanlm/annotation_example
• Updated • 1.82k • 66
• Updated • 1.82k • 68
textcleanlm/textclean-20M
• Updated • 18.3k • 128
textcleanlm/textclean-corpus-10M-deepseek-ablation
• Updated • 18.1k • 6
textcleanlm/textclean-corpus-1M-variant-ablation-research
• Updated • 1.82k • 67
textcleanlm/textclean-corpus-1M-old
• Updated • 1.82k • 65
• 1
textcleanlm/textclean-corpus-1M-o4-mini
• Updated • 1.82k • 65