# MoE Merged: Qwen2.5-1.5B + Phi-2
A Mixture-of-Experts (MoE) model that merges Qwen2.5-1.5B-Instruct and Phi-2 as its experts.
## Parameters

- Total: ~4.4B
- Trainable: ~1.8B (routers + projections; see the sketch below)
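
The split above can be sanity-checked after loading the model. Below is a minimal sketch that keeps only router/projection parameters trainable, freezes everything else, and counts both groups. The name patterns (`"router"`, `"gate"`, `"proj"`) are assumptions about this merge's module naming and are not documented by the repo.

```python
# Minimal sketch: freeze expert weights, keep only router/projection
# parameters trainable, then count both groups.
# The name patterns below are assumed module names, not documented by the repo.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Echoes123-3/qwen-phi-moe")

TRAINABLE_PATTERNS = ("router", "gate", "proj")  # assumed naming

for name, param in model.named_parameters():
    # Trainable only if the parameter belongs to a router/projection module.
    param.requires_grad = any(pat in name for pat in TRAINABLE_PATTERNS)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"total: {total / 1e9:.2f}B, trainable: {trainable / 1e9:.2f}B")
```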
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Echoes123-3/qwen-phi-moe")
tokenizer = AutoTokenizer.from_pretrained("Echoes123-3/qwen-phi-moe")

prompt = "What is AI?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0]))
```
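
For chat-style prompting, the tokenizer's chat template can be applied first, assuming the merged tokenizer inherits a template from Qwen2.5-1.5B-Instruct; the repo does not confirm this, so treat it as an assumption. A minimal sketch:

```python
# Sketch: chat-style prompting, assuming the merged tokenizer ships a
# Qwen2.5-Instruct-style chat template (not confirmed by the repo).
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Echoes123-3/qwen-phi-moe")
tokenizer = AutoTokenizer.from_pretrained("Echoes123-3/qwen-phi-moe")

messages = [{"role": "user", "content": "What is AI?"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```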