Hey everyone! Just wanted to share this awesome dataset, which features over 1 million tokens specifically for the Egyptian dialect. Check it out: HeshamHaroon/1milion_token_EGY_songs
Fav dataset
HuggingFaceFW/fineweb • Viewer • Updated Jul 11, 2025 • 52.5B • 199k • 2.61k
HeshamHaroon/ArzEn-MultiGenre • Viewer • Updated Dec 31, 2023 • 26k • 932 • 12
gretelai/synthetic_text_to_sql • Viewer • Updated Dec 16, 2025 • 106k • 2.52k • 624
Anthropic/persuasion • Viewer • Updated Apr 9, 2024 • 3.94k • 159 • 199
My favorite models
meta-llama/Meta-Llama-3-8B • Text Generation • 8B • Updated Sep 27, 2024 • 1.55M • 6.44k
meta-llama/Meta-Llama-3-8B-Instruct • Text Generation • 8B • Updated Jun 18, 2025 • 1.43M • 4.36k
ai21labs/Jamba-v0.1 • Text Generation • 52B • Updated Sep 11, 2024 • 847 • 1.19k
CohereLabs/c4ai-command-r-plus • Text Generation • 104B • Updated Apr 16, 2025 • 1.98k • 1.77k
HeshamHaroon/whisper-small-with-google-fleurs-ar Automatic Speech Recognition • Updated Feb 8, 2024 • 1