# Qwen3-30B-A3B-Kaidol-v5

A Korean character roleplay model fine-tuned from Qwen3-30B-A3B-Instruct-2507. v5 deepens style learning by expanding the LoRA target modules from attention-only to attention + FFN.
## Model Description
This model is optimized for Korean character roleplay conversations, trained with custom character datasets featuring distinct personalities, speech patterns, and emotional expressions.
### Key Improvements over v4
- Expanded Target Modules: Added FFN layers (gate_proj, up_proj, down_proj) for better vocabulary and style learning
- Increased LoRA Capacity: Rank 64 (vs 32 in v4) with alpha 128 (vs 64 in v4)
- Lower Loss: Final loss 0.236 (vs 0.833 in v4)
- Higher Token Accuracy: 93.8% (vs 80.2% in v4)
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-30B-A3B-Instruct-2507 |
| Method | LoRA (merged) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training Epochs | 3 |
| Learning Rate | 1e-5 |
| Batch Size | 2 per device × 16 gradient accumulation (effective 32) |
| Max Sequence Length | 2048 |
| Final Loss | 0.236 |
| Token Accuracy | 93.8% |
### Target Modules (7 total)
- Attention: q_proj, k_proj, v_proj, o_proj
- FFN: gate_proj, up_proj, down_proj
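To get a feel for how much capacity the v5 configuration adds, recall that LoRA attaches a rank-`r` matrix pair (`A`: `r × d_in`, `B`: `d_out × r`) to each targeted projection, so each module contributes `r · (d_in + d_out)` trainable parameters. The sketch below compares v4 (rank 32, 4 attention modules) with v5 (rank 64, all 7 modules) using illustrative projection shapes only; these are hypothetical dimensions, not the actual Qwen3-30B-A3B layer sizes.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in projection:
    an A matrix (rank x d_in) plus a B matrix (d_out x rank)."""
    return rank * d_in + d_out * rank

# Hypothetical per-layer shapes (d_in, d_out) for illustration only --
# not the real Qwen3-30B-A3B dimensions.
attn = [(2048, 2048)] * 4                          # q_proj, k_proj, v_proj, o_proj
ffn = [(2048, 5632), (2048, 5632), (5632, 2048)]   # gate_proj, up_proj, down_proj

v4_per_layer = sum(lora_params(i, o, rank=32) for i, o in attn)        # attention only
v5_per_layer = sum(lora_params(i, o, rank=64) for i, o in attn + ffn)  # attention + FFN

print(v5_per_layer / v4_per_layer)  # roughly 4.8x more trainable parameters per layer
```

Doubling the rank and adding the three FFN projections multiplies per-layer trainable parameters several times over, which is what gives v5 room to learn vocabulary and style beyond attention patterns.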
### Comparison: v4 vs v5
| Metric | v4 | v5 |
|---|---|---|
| LoRA Rank | 32 | 64 |
| Target Modules | 4 (Attention) | 7 (Attention + FFN) |
| Final Loss | 0.833 | 0.236 |
| Token Accuracy | 80.2% | 93.8% |
## Intended Use
This model is designed for:
- Korean character roleplay conversations
- Interactive storytelling
- Character-based chat applications
- Creative writing assistance
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "developer-lunark/Qwen3-30B-A3B-Kaidol-v5"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "안녕하세요!"},  # "Hello!" in Korean
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## vLLM Serving
```bash
vllm serve developer-lunark/Qwen3-30B-A3B-Kaidol-v5 \
  --tensor-parallel-size 2 \
  --max-model-len 8192
```
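Once the server is up, vLLM exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal client sketch using only the Python standard library, assuming the default local port 8000:

```python
import json
import urllib.request

# Default local vLLM endpoint; adjust host/port if you serve elsewhere.
URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "developer-lunark/Qwen3-30B-A3B-Kaidol-v5",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "안녕하세요!"},  # "Hello!" in Korean
    ],
    "max_tokens": 512,
}

def chat(url: str = URL) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Uncomment once the server is running:
# print(chat())
```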
## Limitations
- Optimized for Korean language; performance in other languages may vary
- Character roleplay focused; may not be optimal for factual Q&A
- Inherits limitations from the base Qwen3 model
## License
Apache 2.0 (following the base model license)
## Related Models
- Qwen3-30B-A3B-Kaidol-v4 - Attention-only LoRA version
## Acknowledgments
- Base model: Qwen/Qwen3-30B-A3B-Instruct-2507
- Fine-tuned by developer-lunark