MoE Merged: Qwen2.5-1.5B + Phi-2

A Mixture of Experts (MoE) model that merges Qwen2.5-1.5B-Instruct and Phi-2 as its two experts.

Parameters

  • Total: ~4.4B
  • Trainable: ~1.8B (routers + expert projections; see the sketch below)
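
The exact merge architecture isn't documented in this card, but the trainable pieces (routers and projections) can be pictured roughly as follows. This is a minimal, hypothetical sketch: it assumes a learned per-token router over two frozen experts whose hidden states are projected into a shared width. The class name TwoExpertRouter, the shared width, and the gating scheme are illustrative assumptions, not the actual implementation.

import torch
import torch.nn as nn

class TwoExpertRouter(nn.Module):
    """Hypothetical sketch of the trainable routing/projection layers.

    Assumes each expert's hidden states are projected into a shared width
    (Qwen2.5-1.5B uses hidden size 1536, Phi-2 uses 2560).
    """

    def __init__(self, dim_qwen=1536, dim_phi=2560, dim_shared=2048):
        super().__init__()
        # Trainable projections into a common hidden size
        self.proj_qwen = nn.Linear(dim_qwen, dim_shared)
        self.proj_phi = nn.Linear(dim_phi, dim_shared)
        # Trainable router: per-token softmax weights over the two experts
        self.router = nn.Linear(dim_shared, 2)

    def forward(self, h_qwen, h_phi):
        # h_qwen: (batch, seq, dim_qwen), h_phi: (batch, seq, dim_phi)
        q = self.proj_qwen(h_qwen)
        p = self.proj_phi(h_phi)
        # Gate computed from the averaged projected states
        gate = torch.softmax(self.router((q + p) / 2), dim=-1)  # (batch, seq, 2)
        return gate[..., :1] * q + gate[..., 1:] * p

# Trainable parameters in one such layer (for illustration only)
layer = TwoExpertRouter()
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))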

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged MoE checkpoint and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("Echoes123-3/qwen-phi-moe")
tokenizer = AutoTokenizer.from_pretrained("Echoes123-3/qwen-phi-moe")

# Encode a prompt and generate a completion
prompt = "What is AI?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
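
For GPU inference, the same checkpoint can be loaded in half precision with automatic device placement. This is a generic transformers pattern (device_map="auto" requires the accelerate package), not something specific to this model; the prompt and sampling settings below are only examples.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Half-precision load with automatic device placement (requires `accelerate`)
model = AutoModelForCausalLM.from_pretrained(
    "Echoes123-3/qwen-phi-moe",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Echoes123-3/qwen-phi-moe")

prompt = "Explain mixture-of-experts routing in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))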