# Xerv-AI/Ada: The Multi-Domain Mathematical Generalist SLM
Ada is an ultra-lightweight, high-speed, and highly optimized reasoning Small Language Model (SLM) derived from the powerful Qwen2.5-Math-1.5B architecture. Engineered specifically to bridge the gap between graduate-level mathematical proofs and standard conversational utility, Ada addresses the notorious "catastrophic forgetting" problem often found in math-heavy fine-tunes. Whether you need a step-by-step calculus breakdown, a topological proof in LaTeX, or just a simple conversational assistant for daily tasks, Ada delivers state-of-the-art performance for a 1.5B-parameter model.
## Model Overview
Standard math-specific LLMs frequently suffer from domain overfitting. When prompted with basic conversational queries, they either hallucinate lengthy pseudo-proofs or fail entirely to understand the user's intent. Xerv-AI/Ada was meticulously engineered to resolve this by utilizing a carefully balanced, dual-distribution training dataset, allowing it to act as both a rigorous STEM assistant and a general-purpose chat model.
| Specification | Details |
|---|---|
| Model Name | Xerv-AI/Ada |
| Base Architecture | unsloth/Qwen2.5-Math-1.5B |
| Parameter Count | 1.5 Billion |
| Primary Capabilities | Graduate-level STEM reasoning, logical deduction, and mathematical proofs. |
| Secondary Capabilities | General conversational instruction-following, roleplay, and basic coding. |
| Training Framework | QLoRA via Unsloth (Triton kernels). |
| Precision | Merged 16-bit (Fine-tuned in 4-bit). |
| License | Apache-2.0 |
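
The exact fine-tuning hyperparameters have not been published. As a rough illustration of what a QLoRA run with Unsloth typically looks like for this base model, here is a minimal sketch; the rank, alpha, and target modules below are assumptions, not Ada's actual configuration.

```python
from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized (frozen) weights -- the "Q" in QLoRA
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-Math-1.5B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach trainable LoRA adapters; only these small low-rank matrices
# (roughly 1-2% of total parameters) receive gradient updates
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,             # assumed rank
    lora_alpha = 16,    # assumed scaling factor
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)
```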
### Training Data

| Dataset | Sample Size |
| :--- | :--- |
| Xerv-AI/GRAD | ~1.93k rows |
| yahma/alpaca-cleaned | ~2.00k rows |
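
The curation pipeline itself is not published. The snippet below is a minimal sketch of how such a balanced, dual-distribution mixture could be assembled with the Hugging Face `datasets` library, assuming both sources share the Alpaca `instruction`/`input`/`output` schema:

```python
from datasets import load_dataset, concatenate_datasets

# Math reasoning data (~1.93k rows) plus a matched slice of general instructions
grad = load_dataset("Xerv-AI/GRAD", split = "train")
alpaca = load_dataset("yahma/alpaca-cleaned", split = "train").shuffle(seed = 42).select(range(2000))

# Interleave the two distributions into one shuffled fine-tuning mixture
mixture = concatenate_datasets([grad, alpaca]).shuffle(seed = 42)
print(mixture)
```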
## Usage & Python Inference Guide
The model is highly responsive to the standard Alpaca Instruction/Response template.

**Important inference note:** for best results, use a `repetition_penalty` of roughly 1.15. This acts as a guardrail against the model looping endlessly through mathematical steps on overly simple arithmetic queries.

### 1. Installation Requirements
```bash
pip install unsloth transformers accelerate torch
```
### 2. Fast Inference Script
```python
from unsloth import FastLanguageModel
import torch

# Configuration
repo_name = "Xerv-AI/Ada"
max_seq_length = 2048

# Load the model and tokenizer (4-bit recommended for low-VRAM GPUs)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = repo_name,
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detect (bfloat16 on Ampere+, float16 otherwise)
    load_in_4bit = True,
)

# Enable optimized inference mode
FastLanguageModel.for_inference(model)

# Define the universal Alpaca-style prompt template
universal_prompt = """### Instruction:
{}
### Response:
{}"""

# Prepare your query
query = "Provide a step-by-step logical proof finding the eigenvalues of the matrix [[2, 1], [1, 2]]."
inputs = tokenizer(
    [universal_prompt.format(query, "")],
    return_tensors = "pt",
).to("cuda")

print("Generating analytical response...")

# Generate the output
outputs = model.generate(
    **inputs,
    max_new_tokens = 1024,
    use_cache = True,
    repetition_penalty = 1.15,  # Critical: prevents generation loops
    pad_token_id = tokenizer.eos_token_id,
)

# Decode and print only the model's answer
response = tokenizer.batch_decode(outputs, skip_special_tokens = True)[0]
print(f"\n{'='*50}\nOutput:\n{'='*50}")
print(response.split("### Response:\n")[-1])
```
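
Unsloth gives the fastest path, but because the published weights are merged to 16-bit, they should also load through vanilla `transformers`. A minimal sketch, assuming a CUDA-capable GPU is available:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Xerv-AI/Ada")
model = AutoModelForCausalLM.from_pretrained(
    "Xerv-AI/Ada",
    torch_dtype = torch.float16,  # merged 16-bit weights
    device_map = "auto",
)

# Same Alpaca-style template as above
prompt = "### Instruction:\nDifferentiate f(x) = x**3 + 2*x.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors = "pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens = 256,
    repetition_penalty = 1.15,  # same guardrail as the Unsloth path
    pad_token_id = tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```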
## Safety & Alignment Guardrails
Despite being fine-tuned on raw mathematical logic and conversational instruction data, Ada retains its foundational safety alignments. Because only 1% to 2% of the parameters were actively updated via LoRA (and subsequently merged), the original base Qwen2.5 weights responsible for safety remain largely intact.
- Content Moderation: The model actively refuses to generate explicit, illegal, or harmful content, relying on the RLHF/DPO safety guardrails instilled during Alibaba's original alignment training.
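
For context on the merge described above, folding a PEFT-style LoRA adapter back into the base weights is a single call. The sketch below is illustrative only; the adapter path is hypothetical and not a published artifact.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the frozen base model, then attach the trained LoRA adapter
base = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-Math-1.5B")
adapter = PeftModel.from_pretrained(base, "path/to/ada-lora-adapter")  # hypothetical path

# merge_and_unload() folds the low-rank update (scaled B @ A) into the
# base weight matrices and returns a plain transformers model
merged = adapter.merge_and_unload()
merged.save_pretrained("ada-merged-16bit")
```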
## Limitations & Known Biases
While Ada punches well above its 1.5B weight class, it is important to acknowledge the limitations inherent to Small Language Models:
- Arithmetic Hallucinations: Ada is exceptionally capable at symbolic logic, structural breakdowns, and mathematical theory. However, like many SLMs, it can occasionally make minor arithmetic errors (e.g., basic addition/subtraction mistakes) deep within multi-page proofs. Always verify raw calculations (see the SymPy sketch after this list).
- Language Constraint: The model is optimized exclusively for English text and standard mathematical notation.
- Prompt Sensitivity: Ada performs at its absolute peak when math queries explicitly ask for a "proof," "step-by-step breakdown," or "logical analysis" within the instruction block.
- World Knowledge: It lacks the broad, encyclopedic trivia knowledge found in massive 70B+ parameter models.
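
As a concrete way to verify raw calculations, a CAS such as SymPy can independently check the eigenvalue query from the inference script above:

```python
from sympy import Matrix

# Independently verify the eigenvalues Ada derives for [[2, 1], [1, 2]]
M = Matrix([[2, 1], [1, 2]])
print(M.eigenvals())  # {3: 1, 1: 1} -> eigenvalues 3 and 1, each multiplicity 1
```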
## Acknowledgements
- Alibaba Cloud: For the phenomenal, state-of-the-art base Qwen2.5-Math architecture.
- Unsloth AI: For the Triton-optimized training kernels that made fine-tuning this model fast and memory-efficient on consumer hardware.
- Xerv-AI: For the curation of the GRAD synthetic dataset powering the advanced reasoning capabilities.