license: cc-by-nc-4.0
tags:
  - translation
  - nllb

My NLLB-200 Translator

This repository hosts a copy of Meta AI's NLLB-200-distilled-600M translation model, mirrored here for personal use and application deployment.

🌟 Model Details

  • Original Developer: Meta AI (Facebook)
  • Model Type: Seq2Seq Language Model (Machine Translation)
  • Model Size: 600 Million parameters
  • License: CC-BY-NC-4.0 (Non-commercial use only)

🌍 Language Support

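This model translates directly between any pair of its 200+ supported languages. Each language is identified by a FLORES-200 code: an ISO 639-3 language code and an ISO 15924 script code joined by an underscore. For example: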

  • English: eng_Latn
  • Telugu: tel_Telu
  • Hindi: hin_Deva
  • French: fra_Latn

🚀 How to Get Started

You can use this model directly with the Hugging Face transformers library:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Replace with your actual repository path
model_name = "YOUR_USERNAME/YOUR_REPO_NAME"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Set source language
tokenizer.src_lang = "eng_Latn"

text = "Hello, how are you today?"
inputs = tokenizer(text, return_tensors="pt")

# Target translation (Example: Telugu)
translated_tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("tel_Telu"),
    max_length=50
)

output = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
print("Translation:", output)
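
For repeated use, the steps above can be wrapped in a helper that translates a batch of sentences in one call. This is a sketch rather than a definitive implementation: the function name `translate` and the default `max_new_tokens=128` are assumptions, and it expects a tokenizer and model that were already loaded as shown above.

```python
def translate(texts, tokenizer, model, src_lang, tgt_lang, max_new_tokens=128):
    """Translate a list of strings with an already-loaded NLLB tokenizer and model."""
    if not texts:
        return []

    # The tokenizer reads src_lang to tag the input with the source language.
    tokenizer.src_lang = src_lang

    # Pad so that sentences of different lengths fit in one batch.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

    generated = model.generate(
        **inputs,
        # Force the decoder to start with the target-language tag.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        # Arbitrary default; raise it for long inputs.
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With `tokenizer` and `model` from the example above, `translate(["Hello, how are you today?"], tokenizer, model, "eng_Latn", "tel_Telu")` returns a list with one translated string. Passing the tokenizer and model in as arguments keeps the expensive loading step outside the function, so the same objects can serve many requests.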

## Citation
@article{nllbteam2022neglected,
  title={No Language Left Behind: Scaling Human-Centered Machine Translation},
  author={NLLB Team and Marta R. Costa-jussà and James Cross and Onur Çelebi and Maha Elbayad and Kenneth Heafield and others},
  journal={arXiv preprint arXiv:2207.04672},
  year={2022}
}