Qwen3-30B-A3B-Kaidol-v5

Korean character roleplay model fine-tuned from Qwen3-30B-A3B-Instruct-2507.

v5 features deeper style learning with expanded LoRA target modules (Attention + FFN).

Model Description

This model is optimized for Korean character roleplay conversations, trained with custom character datasets featuring distinct personalities, speech patterns, and emotional expressions.
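
The training data itself is not published. As an illustration only, a character sample in the usual chat format might look like the sketch below; the persona and dialogue are invented for this example, not taken from the actual dataset.

# Hypothetical sketch of a character roleplay training sample.
# The real dataset is not published; persona and lines here are invented.
sample = {
    "messages": [
        {
            "role": "system",
            # Character persona: "You are 'Haru', a cheerful cafe owner.
            # Always answer customers in a friendly, informal tone."
            "content": "너는 명랑한 카페 주인 '하루'야. 손님에게 항상 반말로 친근하게 대답해.",
        },
        {"role": "user", "content": "오늘 추천 메뉴 있어?"},          # "Any recommendations today?"
        {"role": "assistant", "content": "오늘은 핸드드립이 진짜 맛있어! 한 잔 내려줄까?"},  # in-character reply
    ]
}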

Key Improvements over v4

  • Expanded Target Modules: Added FFN layers (gate_proj, up_proj, down_proj) for better vocabulary and style learning
  • Increased LoRA Capacity: Rank 64 (vs 32 in v4) with alpha 128 (vs 64 in v4)
  • Lower Loss: Final loss 0.236 (vs 0.833 in v4)
  • Higher Token Accuracy: 93.8% (vs 80.2% in v4)

Training Details

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-30B-A3B-Instruct-2507 |
| Method | LoRA (merged into the base model) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Epochs | 3 |
| Learning Rate | 1e-5 |
| Batch Size | 2 × 16 (gradient accumulation; effective batch size 32) |
| Max Sequence Length | 2048 |
| Final Loss | 0.236 |
| Token Accuracy | 93.8% |
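
The training script is not published; as a rough sketch, the hyperparameters above map onto a standard transformers TrainingArguments like this (the output path is hypothetical, and the batch size is read as per-device batch 2 with 16 accumulation steps):

from transformers import TrainingArguments

# Hedged sketch only: restates the table above in TrainingArguments form.
args = TrainingArguments(
    output_dir="kaidol-v5-lora",       # hypothetical output path
    num_train_epochs=3,
    learning_rate=1e-5,
    per_device_train_batch_size=2,     # "2 x 16" read as batch 2 ...
    gradient_accumulation_steps=16,    # ... times 16 accumulation steps (effective 32)
    bf16=True,                         # matches the published BF16 weights
)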

Target Modules (7 total)

  • Attention: q_proj, k_proj, v_proj, o_proj
  • FFN: gate_proj, up_proj, down_proj
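
Assuming the peft library (the actual training code is not published), the adapter configuration implied by the modules and ranks above looks roughly like this:

from peft import LoraConfig

# Sketch of the v5 adapter configuration described above.
lora_config = LoraConfig(
    r=64,                # LoRA rank (v4 used 32)
    lora_alpha=128,      # scaling alpha (v4 used 64)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention
        "gate_proj", "up_proj", "down_proj",      # FFN (new in v5)
    ],
    task_type="CAUSAL_LM",
)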

Comparison: v4 vs v5

| Metric | v4 | v5 |
|---|---|---|
| LoRA Rank | 32 | 64 |
| Target Modules | 4 (Attention) | 7 (Attention + FFN) |
| Final Loss | 0.833 | 0.236 |
| Token Accuracy | 80.2% | 93.8% |

Intended Use

This model is designed for:

  • Korean character roleplay conversations
  • Interactive storytelling
  • Character-based chat applications
  • Creative writing assistance

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "developer-lunark/Qwen3-30B-A3B-Kaidol-v5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # loads in BF16, matching the published weights
    device_map="auto"     # spreads the 30B weights across available devices
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "안녕하세요!"}  # "Hello!"
]

# Build the chat-formatted prompt, tokenize, and generate a reply
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
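
The snippet above uses a generic system prompt. For the character roleplay behavior the model is tuned for, putting a persona in the system turn and enabling mild sampling is more typical; the persona and sampling values below are illustrative assumptions, not published recommendations:

messages = [
    # Hypothetical persona: "You are 'Haru', a bright, playful cafe owner.
    # Always answer in an informal tone."
    {"role": "system", "content": "너는 밝고 장난기 많은 카페 주인 '하루'야. 항상 반말로 대답해."},
    {"role": "user", "content": "오늘 추천 메뉴 있어?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,    # sampling tends to suit roleplay better than greedy decoding
    temperature=0.8,   # illustrative values, tune to taste
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))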

vLLM Serving

# Serves an OpenAI-compatible API (default: http://localhost:8000/v1).
# --tensor-parallel-size should match your GPU count.
vllm serve developer-lunark/Qwen3-30B-A3B-Kaidol-v5 \
    --tensor-parallel-size 2 \
    --max-model-len 8192
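
Once serving, the model can be queried with the openai Python package against that endpoint (the api_key value is a placeholder; vLLM does not require one by default):

from openai import OpenAI

# Point the client at vLLM's OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="developer-lunark/Qwen3-30B-A3B-Kaidol-v5",
    messages=[{"role": "user", "content": "안녕하세요!"}],  # "Hello!"
    max_tokens=512,
)
print(response.choices[0].message.content)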

Limitations

  • Optimized for Korean language; performance in other languages may vary
  • Character roleplay focused; may not be optimal for factual Q&A
  • Inherits limitations from the base Qwen3 model

License

Apache 2.0 (following the base model license)

Related Models

  • Base model: Qwen/Qwen3-30B-A3B-Instruct-2507
  • Previous version: Qwen3-30B-A3B-Kaidol-v4
