deepseek-r1-qwen3-cybersec-merged

Model Description

This is a MERGED version of the DoRA fine-tuned DeepSeek-R1-Qwen3-8B model for cybersecurity.

Original adapter model: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec

This merged model combines the base model weights with the DoRA adapter weights into a single standalone model.
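
For reference, a merge like this is typically produced with PEFT's merge_and_unload(). The sketch below assumes the base and adapter repo IDs listed in this card; the local output directory name is illustrative, not part of this release.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Sketch: fold the DoRA adapter weights into the base model and save a standalone copy.
# The output directory "deepseek-r1-qwen3-cybersec-merged" is only an example name.
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec")

merged = model.merge_and_unload()  # adapter deltas are absorbed into the base weights
merged.save_pretrained("deepseek-r1-qwen3-cybersec-merged")

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
tokenizer.save_pretrained("deepseek-r1-qwen3-cybersec-merged")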

Key Features

  • 🔐 Specialized for cybersecurity and network intrusion detection
  • 🎯 Trained with RAFT methodology for citation-aware responses (see the prompt sketch after this list)
  • ⚡ Merged model - no PEFT dependency required
  • 📦 Standalone model ready to use
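
Because the adapter was trained with RAFT, the model is meant to answer from retrieved context and cite it. The exact prompt template used in training is not documented in this card, so the snippet below is only an illustrative sketch; the document labels and the [DOC#] citation convention are assumptions.

# Illustrative RAFT-style prompt; the [DOC#] labels and citation instruction are assumptions.
docs = [
    "[DOC1] Zeek conn.log shows ~5,000 SYN packets/s to port 443 from a single source IP.",
    "[DOC2] The same source IP appears on a public blocklist for SSH brute forcing.",
]
question = "Is this traffic pattern malicious, and which evidence supports your answer?"

prompt = (
    "Answer using only the documents below and cite them as [DOC#].\n\n"
    + "\n".join(docs)
    + f"\n\nQuestion: {question}\nAnswer:"
)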

Model Details

  • Base Model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
  • Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec
  • Training Method: DoRA + RAFT + 4-bit quantization
  • Model Type: Merged (base + adapter combined)
  • Size: ~16 GB in BF16 (~32 GB in FP32)

Usage

Direct Loading (No PEFT Required!)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load merged model directly
model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged")

# Generate
prompt = "Analyze this network traffic for security threats..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
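
If the full BF16 weights do not fit in GPU memory, the merged model can also be loaded with on-the-fly 4-bit quantization through bitsandbytes. This is a minimal sketch; the quantization settings shown are illustrative defaults, not the configuration used during training.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit (NF4) loading; requires the bitsandbytes package and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged")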

Differences from Adapter Model

Aspect          Adapter Model                 Merged Model (This)
Size            ~800 MB                       ~16 GB (BF16)
Dependencies    Requires PEFT                 Transformers only
Loading         Two-step (base + adapter)     Single-step
Storage Cost    ~$3/month                     ~$60/month
Use Case        Recommended for most uses     When PEFT is not available

Performance

Task performance matches the original adapter model: merging only folds the adapter weights into the base weights, so the resulting network is functionally identical.

Citation

@misc{deepseek_r1_qwen3_cybersec_merged,
 author = {sainikhiljuluri2015},
 title = {DeepSeek-R1-Qwen3 Cybersecurity Model (Merged)},
 year = {2025},
 publisher = {HuggingFace},
 howpublished = {\url{https://huggingface.co/sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged}}
}

License

Apache 2.0

Acknowledgments

  • Base Model: DeepSeek-AI
  • Training: DoRA + RAFT methodology
  • Original Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec