NagaLLaMA-3.2-3B-Instruct-Merged
NagaLLaMA-3.2-3B-Instruct-Merged is the standalone, merged version of the NagaLLaMA-3.2-3B-Instruct LoRA adapter. It combines the fine-tuned Nagamese weights directly with the Llama-3.2-3B-Instruct base model.
This model is optimized for easier deployment (e.g., vLLM, TGI, or GGUF conversion) as it does not require loading adapters separately. It serves as a general-purpose instruction-following assistant for the Nagamese language (Naga Pidgin/Creole).
Model Details
- Developer: Agniva Maiti
- Base Model: meta-llama/Llama-3.2-3B-Instruct
- Language: Nagamese (nag)
- Format: Merged Weights (Safetensors)
- Precision: fp16
Training Data
The model was trained on the NagaNLP Conversational Corpus, which contains 10,021 Nagamese instruction-following pairs.
Data Splitting:
- Training: 80% (approx. 8,000 samples)
- Validation: 10%
- Test: 10%
This model corresponds to the final checkpoint trained on 100% of the available training split.
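Purely as an illustration, the sketch below shows how an 80/10/10 split like the one above could be reproduced with the Hugging Face datasets library. The file name and seed are placeholders, not the actual training setup.

```python
# Illustrative 80/10/10 split with the `datasets` library (not the original pipeline).
from datasets import load_dataset

# Hypothetical local file; the NagaNLP Conversational Corpus may be distributed differently.
ds = load_dataset("json", data_files="naganlp_conversational_corpus.json", split="train")

# Carve off 20% for validation + test, then split that portion in half.
split = ds.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)

train_ds, val_ds, test_ds = split["train"], holdout["train"], holdout["test"]
print(len(train_ds), len(val_ds), len(test_ds))  # roughly 8,000 / 1,000 / 1,000 pairs
```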
Training Hyperparameters (Original Adapter)
- Epochs: 3
- Batch Size: 2 (per device) with 8 gradient accumulation steps
- Sequence Length: 512
- Learning Rate: 2e-4
- LoRA Rank (r): 16
- LoRA Alpha: 32
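For reference, the sketch below shows how these hyperparameters would typically map onto a peft/transformers configuration. It is not the original training script; lora_dropout, target_modules, and other unlisted details are assumptions.

```python
# Sketch only: mapping the hyperparameters above onto peft/transformers config objects.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=32,        # LoRA alpha
    lora_dropout=0.05,    # assumed; not reported above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="nagallama-3.2-3b-lora",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size of 16
    learning_rate=2e-4,
    fp16=True,
)
# The 512-token sequence length would be enforced when tokenizing/packing the data
# (e.g. via a trainer's max_seq_length argument).
```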
Intended Use
This model is intended for:
- Chatbots and assistants requiring Nagamese language support.
- Direct deployment in inference engines (vLLM, Ollama) without adapter management.
- Research into low-resource language modeling.
How to Use
Since this is a merged model, you do not need `peft`; you can load it directly with `transformers`.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "agnivamaiti/NagaLLaMA-3.2-3B-Instruct-Merged"

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inference
prompt = "Machine Learning ki ase aru kote use hoi?"
messages = [
    {"role": "user", "content": prompt},
]

# Apply the chat template and move inputs to the model's device
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.3,
    top_k=15,
    top_p=0.3,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id,
    # Llama tokenizers may not define a pad token; fall back to EOS
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
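For the deployment scenarios mentioned under Intended Use, the merged weights can also be served by an inference engine directly. Below is a minimal offline-inference sketch with vLLM; it assumes a recent vLLM release with `LLM.chat` support and is not part of this repository.

```python
# Minimal vLLM sketch (assumes a recent vllm version with LLM.chat support).
from vllm import LLM, SamplingParams

llm = LLM(model="agnivamaiti/NagaLLaMA-3.2-3B-Instruct-Merged", dtype="float16")
sampling = SamplingParams(temperature=0.3, top_p=0.3, max_tokens=150, repetition_penalty=1.2)

# chat() applies the model's chat template before generation.
outputs = llm.chat(
    [{"role": "user", "content": "Machine Learning ki ase aru kote use hoi?"}],
    sampling,
)
print(outputs[0].outputs[0].text)
```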
Limitations & Safety
- Hallucinations: Like all LLMs, this model may generate incorrect information.
- Bias: The model inherits biases from the base Llama 3.2 model and the specific dialectal patterns found in the training data.
- Critical Use: Not suitable for medical, legal, or financial advice.
Credits
- Acknowledgments: Special thanks to the friends who validated the dataset and model outputs, and to RespAI Lab, KIIT for supporting the research and publication of this work.