deepseek-r1-qwen3-cybersec-merged

Model Description

This is a MERGED version of the DoRA fine-tuned DeepSeek-R1-Qwen3-8B model for cybersecurity.

Original adapter model: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec

This merged model combines the base model weights with the DoRA adapter weights into a single standalone model.
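
For reference, a merge like this is typically produced with PEFT's merge_and_unload(). The sketch below assumes the base and adapter repo IDs listed in this card; the local output directory name is illustrative, not part of this release.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Sketch: fold the DoRA adapter weights into the base model and save a standalone copy.
# The output directory "deepseek-r1-qwen3-cybersec-merged" is only an example name.
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec")

merged = model.merge_and_unload()  # adapter deltas are absorbed into the base weights
merged.save_pretrained("deepseek-r1-qwen3-cybersec-merged")

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-0528-Qwen3-8B")
tokenizer.save_pretrained("deepseek-r1-qwen3-cybersec-merged")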

Key Features

  • 🔐 Specialized for cybersecurity and network intrusion detection
  • 🎯 Trained with RAFT methodology for citation-aware responses (see the prompt sketch after this list)
  • ⚡ Merged model - no PEFT dependency required
  • 📦 Standalone model ready to use
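
Because the adapter was trained with RAFT, the model is meant to answer from retrieved context and cite it. The exact prompt template used in training is not documented in this card, so the snippet below is only an illustrative sketch; the document labels and the [DOC#] citation convention are assumptions.

# Illustrative RAFT-style prompt; the [DOC#] labels and citation instruction are assumptions.
docs = [
    "[DOC1] Zeek conn.log shows ~5,000 SYN packets/s to port 443 from a single source IP.",
    "[DOC2] The same source IP appears on a public blocklist for SSH brute forcing.",
]
question = "Is this traffic pattern malicious, and which evidence supports your answer?"

prompt = (
    "Answer using only the documents below and cite them as [DOC#].\n\n"
    + "\n".join(docs)
    + f"\n\nQuestion: {question}\nAnswer:"
)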

Model Details

  • Base Model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
  • Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec
  • Training Method: DoRA + RAFT + 4-bit quantization
  • Model Type: Merged (base + adapter combined)
  • Size: ~16 GB in BF16 (~32 GB in FP32)

Usage

Direct Loading (No PEFT Required!)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load merged model directly
model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged")

# Generate
prompt = "Analyze this network traffic for security threats..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
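
If the full BF16 weights do not fit in GPU memory, the merged model can also be loaded with on-the-fly 4-bit quantization through bitsandbytes. This is a minimal sketch; the quantization settings shown are illustrative defaults, not the configuration used during training.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit (NF4) loading; requires the bitsandbytes package and a CUDA GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged")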

Differences from Adapter Model

Aspect          Adapter Model                 Merged Model (This)
Size            ~800 MB                       ~16 GB (BF16)
Dependencies    Requires PEFT                 Transformers only
Loading         Two-step (base + adapter)     Single-step
Storage Cost    ~$3/month                     ~$60/month
Use Case        Recommended for most uses     When PEFT is not available

Performance

Task performance matches the original adapter model: merging only folds the adapter weights into the base weights, so the resulting network is functionally identical.

Citation

@misc{deepseek_r1_qwen3_cybersec_merged,
 author = {sainikhiljuluri2015},
 title = {DeepSeek-R1-Qwen3 Cybersecurity Model (Merged)},
 year = {2025},
 publisher = {HuggingFace},
 howpublished = {\url{https://huggingface.co/sainikhiljuluri2015/deepseek-r1-qwen3-cybersec-merged}}
}

License

Apache 2.0

Acknowledgments

  • Base Model: DeepSeek-AI
  • Training: DoRA + RAFT methodology
  • Original Adapter: sainikhiljuluri2015/deepseek-r1-qwen3-cybersec