Multilingual
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
Paper • 2401.05811
Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis
Paper • 2409.20059
Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation
Paper • 2302.14220
Cut Your Losses in Large-Vocabulary Language Models
Paper • 2411.09009
How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms
Paper • 2410.14387
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
Paper • 2503.00865
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Paper • 2506.14761
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Paper • 2506.20544
SambaLingo: Teaching Large Language Models New Languages
Paper • 2404.05829