NagaLLaMA-3.2-3B-Instruct-Merged

NagaLLaMA-3.2-3B-Instruct-Merged is the standalone, merged version of the NagaLLaMA-3.2-3B-Instruct LoRA adapter. It combines the fine-tuned Nagamese weights directly with the Llama-3.2-3B-Instruct base model.
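
For reference, merging a LoRA adapter into base weights with peft typically looks like the following sketch (the adapter repo id shown is a hypothetical placeholder, not necessarily the published one):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, apply the adapter, then fold the LoRA deltas
# into the base weights so no adapter is needed at inference time.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "agnivamaiti/NagaLLaMA-3.2-3B-Instruct")  # hypothetical adapter id
merged = model.merge_and_unload()
merged.save_pretrained("NagaLLaMA-3.2-3B-Instruct-Merged")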

This model is optimized for easier deployment (e.g., vLLM, TGI, or GGUF conversion) as it does not require loading adapters separately. It serves as a general-purpose instruction-following assistant for the Nagamese language (Naga Pidgin/Creole).
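
As a minimal deployment sketch, the merged checkpoint can be served with vLLM's offline API (assuming a recent vLLM release with LLM.chat support; the sampling values mirror the usage example further below):

from vllm import LLM, SamplingParams

# Load the merged checkpoint directly; no adapter handling required.
llm = LLM(model="agnivamaiti/NagaLLaMA-3.2-3B-Instruct-Merged", dtype="float16")
params = SamplingParams(temperature=0.3, top_p=0.3, max_tokens=150)

outputs = llm.chat(
    [{"role": "user", "content": "Machine Learning ki ase aru kote use hoi?"}],
    params,
)
print(outputs[0].outputs[0].text)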

Model Details

  • Developer: Agniva Maiti
  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Language: Nagamese (nag)
  • Format: Merged Weights (Safetensors)
  • Precision: fp16

Training Data

The model was trained on the NagaNLP Conversational Corpus, which contains 10,021 Nagamese instruction-following pairs.
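
A single pair in such a corpus typically looks like the following (the exact schema of the NagaNLP corpus is an assumption; the user turn reuses the example prompt from the usage section below):

# Hypothetical record layout; the real corpus schema may differ.
example = {
    "messages": [
        {"role": "user", "content": "Machine Learning ki ase aru kote use hoi?"},
        {"role": "assistant", "content": "..."},  # Nagamese reference answer (elided)
    ]
}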

Data Splitting:

  • Training: 80% (approx. 8,000 samples)
  • Validation: 10% (approx. 1,000 samples)
  • Test: 10% (approx. 1,000 samples)

This model corresponds to the final checkpoint trained on 100% of the available training split.

Training Hyperparameters (Original Adapter)

  • Epochs: 3
  • Batch Size: 2 (per device) with 8 gradient accumulation steps
  • Sequence Length: 512
  • Learning Rate: 2e-4
  • LoRA Rank (r): 16
  • LoRA Alpha: 32
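
A minimal sketch of how these settings map onto a peft/transformers training configuration (a hypothetical reconstruction; the original script, target modules, and scheduler are not published here):

from peft import LoraConfig
from transformers import TrainingArguments

# Hypothetical reconstruction of the adapter's training setup.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="nagallama-lora",      # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,    # effective batch size of 16
    learning_rate=2e-4,
    fp16=True,
)
# Inputs were truncated to a maximum sequence length of 512 tokens.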

Intended Use

This model is intended for:

  • Chatbots and assistants requiring Nagamese language support.
  • Direct deployment in inference engines (vLLM, Ollama) without adapter management.
  • Research into low-resource language modeling.

How to Use

Since this is a merged model, you do not need peft. You can load it directly with transformers.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "agnivamaiti/NagaLLaMA-3.2-3B-Instruct-Merged"

# Load Model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inference
prompt = "Machine Learning ki ase aru kote use hoi?"

messages = [
    {"role": "user", "content": prompt},
]

# Apply chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)  # follow device_map instead of hard-coding "cuda"

outputs = model.generate(
    **inputs,
    max_new_tokens=150,      
    do_sample=True,
    temperature=0.3,
    top_k=15,               
    top_p=0.3,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id,
    # Llama tokenizers may not define a pad token; fall back to EOS.
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Limitations & Safety

  • Hallucinations: Like all LLMs, this model may generate incorrect information.
  • Bias: The model inherits biases from the base Llama 3.2 model and the specific dialectal patterns found in the training data.
  • Critical Use: Not suitable for medical, legal, or financial advice.

Credits

  • Acknowledgments: Special thanks to the friends who validated the dataset and model outputs, and to RespAI Lab, KIIT for supporting the research and publication of this work.