MN-12B-Faun-RP-RU
🌟 About the Model
MN-12B-Faun-RP-RU is an improved merge built on Mistral Nemo 12B that develops the ideas behind Hydra, focusing on:
- 🎭 More stable and expressive roleplay
- 📚 Improved Russian-language output
- 🧠 An expanded vocabulary, including complex and NSFW topics
- 🔓 Virtually no censorship
The model was assembled via TIES merging and received no additional training after the merge.
🎯 Features
- Primary focus on the Russian language
- Better at staying in character and maintaining dialogue style
- Richer, more varied generation
- Improved stability over long contexts (tested up to ~8192 tokens)
- Follows instructions, but may add disclaimers on sensitive requests
⚠️ Important
The model remains uncensored in character, but in some cases it may prepend warnings about inappropriate content when prompted directly. Generation is not blocked and continues after the disclaimer.
High-quality TIES merge based on Mistral Nemo 12B, focused on improved Russian fluency, stronger roleplay, richer vocabulary, and stable long-context performance.
🌍 Overview
MN-12B-Faun-RP-RU is an evolution of the Hydra-style merge, designed to push further in roleplay quality, language richness, and generation stability.
Key improvements include:
- 📚 Better Russian
- 🎭 More consistent and immersive roleplay behavior
- 🧠 Expanded vocabulary, including expressive and NSFW domains
- 🔁 More stable handling of long conversations (tested up to ~8k tokens)
The model may occasionally produce safety disclaimers when prompted directly for sensitive content, but generation continues normally afterward.
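If those occasional disclaimers are unwanted in an application, they can be stripped in post-processing. A minimal sketch follows; the prefix patterns here are hypothetical and should be adjusted to the disclaimers this model actually emits:

```python
import re

# Hypothetical disclaimer prefixes; extend based on observed model outputs.
DISCLAIMER_RE = re.compile(
    r"^\s*(?:Warning|Disclaimer|Content warning)\b[^\n]*\n+",
    re.IGNORECASE,
)

def strip_leading_disclaimer(text: str) -> str:
    """Remove a single leading disclaimer line, if present."""
    return DISCLAIMER_RE.sub("", text, count=1)
```

Text without a leading disclaimer passes through unchanged, so the helper is safe to apply to every response.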
Built using TIES merging, which minimizes destructive interference between merged model weights.
🎯 Key Features
| Feature | Description |
|---|---|
| Languages | Russian, English |
| Censorship | Mostly uncensored (with occasional disclaimers) |
| Roleplay | Improved consistency and immersion |
| Instruction Following | Strong |
| Vocabulary | Expanded, including NSFW domains |
| Context Length | Stable up to ~8192 tokens |
| Architecture | Mistral Nemo 12B |
🧩 Model Composition
The merge combines the following models:
| Model | Role in merge | Weight |
|---|---|---|
| MN-12B-Hydra-RP-RU | Base / foundation | 0.60 |
| Impish_Bloodmoon_12B | RP + style boost | 0.25 |
| Forgotten-Safeword-12B-v4.0 | Uncensored behavior | 0.10 |
Weights shown before normalization.
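Since the raw weights sum to 0.95 and the merge configuration sets `normalize: true`, mergekit rescales them to sum to 1. A quick check of the resulting proportions:

```python
# Raw TIES weights from the composition table above (sum to 0.95)
weights = {
    "MN-12B-Hydra-RP-RU": 0.60,
    "Impish_Bloodmoon_12B": 0.25,
    "Forgotten-Safeword-12B-v4.0": 0.10,
}

total = sum(weights.values())
normalized = {name: w / total for name, w in weights.items()}

for name, w in normalized.items():
    print(f"{name}: {w:.3f}")
# Hydra ≈ 0.632, Bloodmoon ≈ 0.263, Safeword ≈ 0.105
```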
💡 Usage Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "limloop/MN-12B-Faun-RP-RU"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "You are a mysterious forest faun speaking in poetic Russian."
messages = [{"role": "user", "content": prompt}]

# add_generation_prompt=True appends the assistant turn marker so the
# model writes a reply instead of continuing the user message
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=True is required for temperature to take effect
outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
⚙️ Merge Details
Built using mergekit with the TIES method (Trim, Elect Sign, Merge).
Core mechanism:
- **Trim** low-magnitude deltas via `density`
- **Elect Sign** to resolve sign conflicts between models
- **Merge** via weighted averaging of the sign-aligned parameters
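The three steps can be sketched for a single weight tensor. This is a simplified NumPy illustration of the TIES idea, not mergekit's actual implementation; in particular, sign election here simply takes the sign of the weighted delta sum:

```python
import numpy as np

def ties_merge(base, finetuned, weights, density=0.9):
    """Simplified per-tensor TIES: Trim, Elect Sign, Merge."""
    # Task vectors: deltas of each fine-tune from the shared base
    deltas = [ft - base for ft in finetuned]

    # 1. Trim: keep only the top `density` fraction of each delta by magnitude
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.size))
        threshold = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= threshold, d, 0.0))

    # 2. Elect sign: per-parameter sign of the weighted delta sum
    weighted = np.stack([w * d for w, d in zip(weights, trimmed)])
    elected = np.sign(weighted.sum(axis=0))

    # 3. Merge: weighted-average only the deltas agreeing with the elected sign
    agree = (np.sign(weighted) == elected) & (weighted != 0)
    w_col = np.array(weights).reshape(-1, *([1] * base.ndim))
    w_sum = np.where(agree, w_col, 0.0).sum(axis=0)
    merged = np.divide((weighted * agree).sum(axis=0), w_sum,
                       out=np.zeros_like(base), where=w_sum > 0)
    return base + merged
```

When two models push a parameter in opposite directions with equal force, the elected sign is zero and the conflicting deltas cancel instead of destructively averaging.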
Merge Configuration
```yaml
models:
  - model: limloop/MN-12B-Hydra-RP-RU
    parameters:
      weight: 0.6
  - model: SicariusSicariiStuff/Impish_Bloodmoon_12B
    parameters:
      weight: 0.25
      density: 0.9
  - model: ReadyArt/Forgotten-Safeword-12B-v4.0
    parameters:
      weight: 0.1
      density: 0.6

merge_method: ties
parameters:
  epsilon: 0.01
  normalize: true

base_model: limloop/MN-12B-Hydra-RP-RU
dtype: bfloat16

tokenizer:
  source: "base"
```
⚠️ Known Characteristics
- No post-merge fine-tuning
- May produce safety disclaimers before sensitive outputs
- Occasionally switches to English in complex reasoning
- Stronger stylistic bias in roleplay compared to Hydra