# deepseek-r1-qwen3-cybersec-merged
## Model Description
This is a **merged** version of the DoRA fine-tuned DeepSeek-R1-0528-Qwen3-8B model for cybersecurity.

Original adapter model: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec

This merged model folds the DoRA adapter weights into the base model weights, producing a single standalone checkpoint that loads without PEFT.
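For reference, a checkpoint like this is typically produced by attaching the adapter to the base model with the `peft` library and folding the weights together. The sketch below is illustrative only; the output path, dtype, and exact script are assumptions, not the script used to build this repository.

```python
# Illustrative merge sketch using the peft library; paths, dtype, and the output
# directory are assumptions, not the exact script used to produce this checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Attach the DoRA adapter on top of the base weights
model = PeftModel.from_pretrained(base, "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec")

# Fold the adapter into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()

# Save a standalone checkpoint that loads with plain transformers
merged.save_pretrained("deepseek-r1-qwen3-cybersec-merged")
AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"
).save_pretrained("deepseek-r1-qwen3-cybersec-merged")
```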
## Key Features
- 🔐 Specialized for cybersecurity and network intrusion detection
- 🎯 Trained with RAFT methodology for citation-aware responses
- ⚡ Merged model - no PEFT dependency required
- 📦 Standalone model ready to use
## Model Details
- Base Model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
- Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec
- Training Method: DoRA + RAFT + 4-bit quantization
- Model Type: Merged (base + adapter combined)
- Size: ~16 GB in bfloat16 (~32 GB in float32); see the 4-bit loading sketch below for memory-constrained hardware
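The merged checkpoint is a full 8B-parameter model, so it needs roughly 16 GB of GPU memory in bfloat16. If that is not available, one option is to load it with on-the-fly 4-bit quantization via bitsandbytes. This is an optional loading path (assuming the `bitsandbytes` package is installed), not how the checkpoint is stored.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional: NF4 on-the-fly quantization to cut GPU memory use roughly 4x.
# Assumes bitsandbytes is installed; the checkpoint itself is full precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```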
## Usage
### Direct Loading (No PEFT Required)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load merged model directly
model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged")

# Generate
prompt = "Analyze this network traffic for security threats..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
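Since the base model is a chat/reasoning model, prompts may work better when wrapped in the tokenizer's chat template. A short sketch continuing from the snippet above, assuming the merged repository ships the base model's chat template:

```python
# Optional: format the prompt with the chat template (assumes the merged repo
# ships the base model's chat template).
messages = [
    {"role": "user", "content": "Analyze this network traffic for security threats..."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```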
## Differences from the Adapter Model
| Aspect | Adapter Model | Merged Model (this repo) |
|---|---|---|
| Size | ~800 MB | ~16-32 GB |
| Dependencies | Transformers + PEFT | Transformers only |
| Loading | Two-step (base + adapter); see the sketch below | Single-step |
| Storage Cost | ~$3/month | ~$60/month |
| Use Case | Recommended for most users | When PEFT is not available |
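For comparison, the adapter variant is loaded in two steps, base model first and adapter on top; a minimal sketch, assuming the `peft` library is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Step 1: load the base model
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Step 2: attach the DoRA adapter
model = PeftModel.from_pretrained(base, "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec")
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
```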
## Performance

Performance should match the original adapter model: merging only folds the DoRA adapter weights into the base weights, so the resulting network computes the same function (up to minor numerical precision differences).
## Citation

```bibtex
@misc{deepseek_r1_qwen3_cybersec_merged,
  author       = {sainikhiljuluri2015},
  title        = {DeepSeek-R1-Qwen3 Cybersecurity Model (Merged)},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged}}
}
```
## License
Apache 2.0
## Acknowledgments
- Base Model: DeepSeek-AI
- Training: DoRA + RAFT methodology
- Original Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec