
Stack X Ultimate

A state-of-the-art agentic coding model built on Qwen2.5-Coder-3B-Instruct

Stack X is a LoRA adapter trained on a curated mix of real agentic conversations, designed to make open-weight models better at multi-step tool use, code generation, and complex reasoning tasks.


Model Details

  • Base Model: Qwen/Qwen2.5-Coder-3B-Instruct
  • Architecture: Transformer (3B parameters)
  • Training Type: QLoRA (LoRA rank 32, 7 modules targeted)
  • Trained by: Walid Sobhie via OpenClaw agentic pipeline
  • Framework: Hugging Face Transformers + PEFT + PyTorch bf16
  • Training Hardware: NVIDIA V100-SXM2-16GB (GCP spot instance)
  • Training Steps: 3,000 steps (curriculum sorted, cosine LR decay)
  • Effective Batch Size: 16 (gradient accumulation)
  • Max Context: 1,536 tokens

Training Data

Source                     Description                                   Count
NVIDIA Nemotron Agentic    Real multi-step tool-calling conversations    ~7,000
Stack-4.0 Smart            High-complexity agentic tasks                 ~10,000
Stack-4.0 Tools            Diverse tool-use patterns                     ~10,000
Total (deduped)            After deduplication                           ~6,100

Training data was filtered, deduplicated, and sorted by complexity (curriculum learning) before training.
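
The exact filtering and complexity-scoring pipeline is not published; the snippet below is only a minimal sketch of the deduplication and curriculum-ordering step, assuming each example is a list of chat messages and using a simple turn-count/length heuristic as the complexity score.

# Minimal sketch of the dedup + curriculum-ordering step (illustrative only;
# the actual pipeline and its complexity metric are not published).
import hashlib

def complexity(example):
    # Hypothetical heuristic: more turns and more text = harder example.
    messages = example["messages"]
    return (len(messages), sum(len(m.get("content") or "") for m in messages))

def dedup_and_sort(examples):
    seen, unique = set(), []
    for ex in examples:
        key = hashlib.sha256(
            "".join(m.get("content") or "" for m in ex["messages"]).encode("utf-8")
        ).hexdigest()
        if key not in seen:                 # drop exact duplicates
            seen.add(key)
            unique.append(ex)
    return sorted(unique, key=complexity)   # easy-to-hard curriculum order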


Capabilities

Stack X is designed to excel at:

  • Multi-step tool use – chains multiple tool calls with proper reasoning
  • Code generation – Python, JavaScript, shell, and more
  • Debugging – finds and explains bugs with fixes
  • Math & reasoning – step-by-step calculation and problem solving
  • Research tasks β€” information retrieval and synthesis

Usage

With PEFT (recommended – preserves base model)

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "Qwen/Qwen2.5-Coder-3B-Instruct"
ADAPTER = "my-ai-stack/Stack-X-Ultimate"

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER)

# Chat
messages = [{"role": "user", "content": "Use the calculate tool to find sqrt(144)"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
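
The example above only asks the model to use a tool in plain text; for actual tool calling you can pass tool schemas through the chat template (recent transformers versions accept a tools= argument in apply_chat_template, which Qwen2.5's template renders). The sketch below continues from the code above; the "calculate" tool and its schema are hypothetical.

# Illustrative tool-calling sketch (the "calculate" tool schema is hypothetical).
# Requires a transformers version whose apply_chat_template accepts tools=.
tools = [{
    "type": "function",
    "function": {
        "name": "calculate",
        "description": "Evaluate a math expression and return the result.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

messages = [{"role": "user", "content": "What is sqrt(144)?"}]
text = tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# The model should emit a <tool_call> block naming "calculate"; parse it, run the
# tool, append the result as a "tool" message, and call generate() again.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))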

Merged (full model)

# See: my-ai-stack/Stack-X-Ultimate-Merged
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-X-Ultimate-Merged", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-X-Ultimate-Merged")
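
If you want to reproduce a standalone checkpoint like the merged repository yourself, PEFT can fold the adapter weights into the base model. A minimal sketch (the output directory name is just an example):

# Fold the LoRA weights into the base model and save a standalone checkpoint.
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-3B-Instruct", torch_dtype="bfloat16"
)
merged = PeftModel.from_pretrained(base, "my-ai-stack/Stack-X-Ultimate").merge_and_unload()
merged.save_pretrained("stack-x-ultimate-merged")  # example output dir
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-3B-Instruct").save_pretrained("stack-x-ultimate-merged")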

Performance

Benchmark                  Score
HumanEval (0-shot)         TBD
Agentic tool call          TBD
Reasoning (commonsense)    TBD

Evaluation results will be posted after training completes.


Limitations

  • The LoRA adapter requires the matching base model (Qwen/Qwen2.5-Coder-3B-Instruct)
  • Max context of 1,536 tokens – not suitable for very long documents (see the truncation snippet below)
  • Trained primarily in English – performance in other languages may vary
  • Tool use is limited to the patterns seen in the training data
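
If your prompts can exceed the 1,536-token training context, it is safest to truncate explicitly at tokenization time; for example:

# Keep prompts within the 1,536-token training context (a conservative choice,
# not a hard limit enforced by the base model).
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1536).to(model.device)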

Training Recipe

Base model:        Qwen/Qwen2.5-Coder-3B-Instruct
LoRA rank:         32 (59M trainable params)
LoRA alpha:        64
Target modules:    q_proj, k_proj, v_proj, o_proj,
                   gate_proj, up_proj, down_proj
Learning rate:     2e-4 (cosine decay)
Warmup:            150 steps
Batch size:        1 × gradient_accumulation=16
Optimizer:         AdamW (bf16)
Max grad norm:     0.5
Weight decay:      0.1
Mixed precision:   bf16
Gradient checkpointing: enabled
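
In PEFT/transformers terms, the recipe corresponds roughly to the configuration below. This is a reconstruction from the listed hyperparameters, not the actual training script; the output directory is just an example.

# Approximate reconstruction of the recipe above as PEFT + transformers configs.
# Not the original training script; values are copied from the table above.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="stack-x-ultimate",      # example path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,     # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    max_steps=3000,
    max_grad_norm=0.5,
    weight_decay=0.1,
    bf16=True,
    gradient_checkpointing=True,
    optim="adamw_torch",
)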

Citation

@misc{stackx2026,
  title={Stack X Ultimate},
  author={Walid Sobhie},
  year={2026},
  url={https://huggingface.co/my-ai-stack/Stack-X-Ultimate}
}

Disclaimer

This model is provided as-is. Training was performed automatically via an OpenClaw agentic pipeline, and results may vary. The model has not been reviewed for safety and is not recommended for production deployments without further evaluation.
