Merlin Research

non-profit

AI & ML interests

Independent AI safety lab. Stockholm, Sweden. We test deployed LLM agents under adversarial conditions and measure behavioral alignment in production — not in controlled benchmarks.

Organization Card

Merlin Research

Merlin Research is an independent AI safety and reasoning research organization focused on building practical, auditable, and robust open models.

Mission

We develop and evaluate models that are:

  • Strong in constrained instruction-following
  • Safer in real-world agentic workflows
  • Better aligned under uncertainty and adversarial prompts
  • Transparent in behavior, limits, and deployment risks

What We Build

  • Safety-oriented reasoning models
  • Alignment-focused post-training pipelines
  • Evaluation suites for robustness, controllability, and failure analysis
  • Open artifacts for reproducible research

Current Focus Areas

  • Safety reasoning for small/efficient LLMs
  • Misalignment reduction via structured post-training
  • Hallucination risk reduction in high-stakes contexts
  • Robust instruction adherence with explicit constraints

Research Principles

  1. Measure behavior, not marketing claims.
  2. Prioritize reproducibility and clear documentation.
  3. Publish limitations, not only strengths.
  4. Design for safe deployment from day one.

Models

Our flagship releases are published under this organization with:

  • Full model cards
  • Clear training/deployment notes
  • Practical usage guidance

Collaboration

We welcome collaboration on:

  • AI safety evaluation
  • Alignment methods
  • Reasoning benchmarks
  • Responsible open model deployment

For partnerships or research collaboration, contact us through Hugging Face discussions or the channels linked in our repositories.


Merlin Research
Safe reasoning. Measurable alignment. Real-world robustness.
