# Xerv-AI/Ada: The Multi-Domain Mathematical Generalist SLM
Ada is an ultra-lightweight, high-speed, and highly optimized reasoning Small Language Model (SLM) derived from the powerful Qwen2.5-Math-1.5B architecture. Engineered specifically to bridge the gap between graduate-level mathematical proofs and standard conversational utility, Ada addresses the notorious "catastrophic forgetting" problem often found in math-heavy fine-tunes. Whether you need a step-by-step calculus breakdown, a topological proof in LaTeX, or just a simple conversational assistant for daily tasks, Ada delivers state-of-the-art performance for a 1.5B-parameter model.
## Model Overview
Standard math-specific LLMs frequently suffer from domain overfitting. When prompted with basic conversational queries, they either hallucinate lengthy pseudo-proofs or fail entirely to understand the user's intent. Xerv-AI/Ada was meticulously engineered to resolve this by utilizing a carefully balanced, dual-distribution training dataset, allowing it to act as both a rigorous STEM assistant and a general-purpose chat model.
| Specification | Details |
|---|---|
| Model Name | Xerv-AI/Ada |
| Base Architecture | unsloth/Qwen2.5-Math-1.5B |
| Parameter Count | 1.5 Billion |
| Primary Capabilities | Graduate-level STEM reasoning, logical deduction, and mathematical proofs. |
| Secondary Capabilities | General conversational instruction-following, roleplay, and basic coding. |
| Training Framework | QLoRA via Unsloth (Triton kernels). |
| Precision | Merged 16-bit (Fine-tuned in 4-bit). |
| License | Apache-2.0 |
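
The exact fine-tuning hyperparameters have not been published. As a rough illustration of what a QLoRA run with Unsloth typically looks like for this base model, here is a minimal sketch; the rank, alpha, and target modules below are assumptions, not Ada's actual configuration.

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized (frozen) weights -- the "Q" in QLoRA
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-Math-1.5B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach trainable LoRA adapters; only these small low-rank matrices
# (roughly 1-2% of total parameters) receive gradient updates
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,             # assumed rank
    lora_alpha = 16,    # assumed scaling factor
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```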
### Training Data

| Dataset | Sample Size |
| :--- | :--- |
| Xerv-AI/GRAD | ~1.93k rows |
| yahma/alpaca-cleaned | ~2.00k rows |
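
The curation pipeline itself is not published. The snippet below is a minimal sketch of how such a balanced, dual-distribution mixture could be assembled with the Hugging Face `datasets` library, assuming both sources share the Alpaca `instruction`/`input`/`output` schema:

```python
from datasets import load_dataset, concatenate_datasets

# Math reasoning data (~1.93k rows) plus a matched slice of general instructions
grad = load_dataset("Xerv-AI/GRAD", split = "train")
alpaca = load_dataset("yahma/alpaca-cleaned", split = "train").shuffle(seed = 42).select(range(2000))

# Interleave the two distributions into one shuffled fine-tuning mixture
mixture = concatenate_datasets([grad, alpaca]).shuffle(seed = 42)
print(mixture)
```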
## Usage & Python Inference Guide
The model is highly responsive to the standard Alpaca Instruction/Response template.

**Important inference note:** for best results, use a `repetition_penalty` of roughly 1.15. This acts as a guardrail against the model looping endlessly through mathematical steps on overly simple arithmetic queries.

### 1. Installation Requirements
```bash
pip install unsloth transformers accelerate torch
```
### 2. Fast Inference Script
```python
from unsloth import FastLanguageModel
import torch

# Configuration
repo_name = "Xerv-AI/Ada"
max_seq_length = 2048

# Load the model and tokenizer (4-bit recommended for low-VRAM GPUs)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = repo_name,
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detect (bfloat16 on Ampere+, float16 otherwise)
    load_in_4bit = True,
)

# Enable optimized inference mode
FastLanguageModel.for_inference(model)

# Define the universal Alpaca-style prompt template
universal_prompt = """### Instruction:
{}
### Response:
{}"""

# Prepare your query
query = "Provide a step-by-step logical proof finding the eigenvalues of the matrix [[2, 1], [1, 2]]."
inputs = tokenizer(
    [universal_prompt.format(query, "")],
    return_tensors = "pt",
).to("cuda")

print("Generating analytical response...")

# Generate the output
outputs = model.generate(
    **inputs,
    max_new_tokens = 1024,
    use_cache = True,
    repetition_penalty = 1.15,  # Critical: prevents generation loops
    pad_token_id = tokenizer.eos_token_id,
)

# Decode and print only the model's answer
response = tokenizer.batch_decode(outputs, skip_special_tokens = True)[0]
print(f"\n{'='*50}\nOutput:\n{'='*50}")
print(response.split("### Response:\n")[-1])
```
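
Unsloth gives the fastest path, but because the published weights are merged to 16-bit, they should also load through vanilla `transformers`. A minimal sketch, assuming a CUDA-capable GPU is available:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Xerv-AI/Ada")
model = AutoModelForCausalLM.from_pretrained(
    "Xerv-AI/Ada",
    torch_dtype = torch.float16,  # merged 16-bit weights
    device_map = "auto",
)

# Same Alpaca-style template as above
prompt = "### Instruction:\nDifferentiate f(x) = x**3 + 2*x.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors = "pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens = 256,
    repetition_penalty = 1.15,  # same guardrail as the Unsloth path
    pad_token_id = tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```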
## Safety & Alignment Guardrails
Despite being fine-tuned on raw mathematical logic and conversational instruction data, Ada retains its foundational safety alignments. Because only 1% to 2% of the parameters were actively updated via LoRA (and subsequently merged), the original base Qwen2.5 weights responsible for safety remain largely intact.
- Content Moderation: The model actively refuses to generate explicit, illegal, or harmful content, relying on the RLHF/DPO safety guardrails instilled during Alibaba's original alignment training.
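
For context on the merge described above, folding a PEFT-style LoRA adapter back into the base weights is a single call. The sketch below is illustrative only; the adapter path is hypothetical and not a published artifact.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model, then attach the trained LoRA adapter
base = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-Math-1.5B")
adapter = PeftModel.from_pretrained(base, "path/to/ada-lora-adapter")  # hypothetical path

# merge_and_unload() folds the low-rank update (scaled B @ A) into the
# base weight matrices and returns a plain transformers model
merged = adapter.merge_and_unload()
merged.save_pretrained("ada-merged-16bit")
```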
## Limitations & Known Biases
While Ada punches well above its 1.5B weight class, it is important to acknowledge the limitations inherent to Small Language Models:
- Arithmetic Hallucinations: Ada is exceptionally capable at symbolic logic, structural breakdowns, and mathematical theory. However, like many SLMs, it can occasionally make minor arithmetic errors (e.g., basic addition/subtraction mistakes) deep within multi-page proofs. Always verify raw calculations (see the SymPy sketch after this list).
- Language Constraint: The model is optimized exclusively for English text and standard mathematical notation.
- Prompt Sensitivity: Ada performs at its absolute peak when math queries explicitly ask for a "proof," "step-by-step breakdown," or "logical analysis" within the instruction block.
- World Knowledge: It lacks the broad, encyclopedic trivia knowledge found in massive 70B+ parameter models.
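
As a concrete way to verify raw calculations, a CAS such as SymPy can independently check the eigenvalue query from the inference script above:

```python
from sympy import Matrix

# Independently verify the eigenvalues Ada derives for [[2, 1], [1, 2]]
M = Matrix([[2, 1], [1, 2]])
print(M.eigenvals())  # {3: 1, 1: 1} -> eigenvalues 3 and 1, each multiplicity 1
```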
## Acknowledgements
- Alibaba Cloud: For the phenomenal, state-of-the-art base Qwen2.5-Math architecture.
- Unsloth AI: For the Triton-optimized training kernels that made fine-tuning this model fast and memory-efficient on consumer hardware.
- Xerv-AI: For the curation of the GRAD synthetic dataset powering the advanced reasoning capabilities.