Haakkim
An open arena-style human preference evaluation platform for Arabic large language models, covering 11 dialect varieties and ranked by a statistically rigorous Bradley–Terry model.
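As a rough illustration of how a Bradley–Terry ranking can be derived from pairwise battle outcomes, here is a minimal sketch. It is not Haakkim's actual pipeline: the function name `fit_bradley_terry`, the `(winner, loser)` input format, and the choice of the classic minorization-maximization (MM) update are all assumptions made for this example, and it ignores ties and sampling weights.

```python
from collections import defaultdict


def fit_bradley_terry(battles, iters=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Hypothetical helper for illustration only. Returns a dict mapping
    each model name to a strength; strengths are normalized to sum to 1.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    models = set()
    for winner, loser in battles:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    models = sorted(models)
    p = {m: 1.0 / len(models) for m in models}  # uniform starting point

    for _ in range(iters):
        new_p = {}
        for i in models:
            # MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
            denom = 0.0
            for j in models:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (p[i] + p[j])
            new_p[i] = wins[i] / denom if denom else p[i]

        total = sum(new_p.values())
        new_p = {m: v / total for m, v in new_p.items()}
        delta = max(abs(new_p[m] - p[m]) for m in models)
        p = new_p
        if delta < tol:
            break
    return p


# Toy data: model "A" beats model "B" in 3 of 4 battles.
strengths = fit_bradley_terry([("A", "B")] * 3 + [("B", "A")])
print(strengths)
```

Under the Bradley–Terry model, the probability that model i beats model j is p_i / (p_i + p_j), so sorting models by fitted strength gives the leaderboard order. A model with no wins ends up with strength 0 in this sketch; production implementations usually add regularization or confidence intervals.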
Ranked Arena
Random model pairing, single-turn prompts in Modern Standard Arabic (MSA), matched system instruction. The only mode that feeds the official Bradley–Terry leaderboard.
✓ BT Leaderboard
Side-by-Side
User-selected model pair, any dialect. Useful for targeted comparisons, but excluded from ranked scoring to prevent selection bias.
Win-rate only
10 Questions
Fixed Arabic prompt pool, any dialect. Provides consistent benchmarking within a curated question set.
Win-rate only
Haakkim / Haakkim-1.0v
Battle records covering all 11 dialect varieties and all 3 evaluation modes. Includes full conversation transcripts, sampling weights, and category annotations.
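For the win-rate-only modes, per-model win rates could be computed from battle records along the following lines. This is a sketch under stated assumptions: the record fields `mode`, `model_a`, `model_b`, and `winner`, the mode labels, and the helper name `win_rates` are hypothetical and may not match the dataset's actual schema.

```python
from collections import defaultdict


def win_rates(records, mode):
    """Per-model win rate within one evaluation mode.

    Assumes each record is a dict with hypothetical keys "mode",
    "model_a", "model_b", and "winner" (one of "model_a", "model_b",
    or any other value for a tie). Ties count as a battle for both
    models but as a win for neither.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for r in records:
        if r["mode"] != mode:
            continue
        a, b = r["model_a"], r["model_b"]
        games[a] += 1
        games[b] += 1
        if r["winner"] == "model_a":
            wins[a] += 1
        elif r["winner"] == "model_b":
            wins[b] += 1
    return {m: wins[m] / games[m] for m in games}


# Toy records with made-up model names and mode labels.
records = [
    {"mode": "side_by_side", "model_a": "X", "model_b": "Y", "winner": "model_a"},
    {"mode": "side_by_side", "model_a": "Y", "model_b": "X", "winner": "tie"},
    {"mode": "ranked", "model_a": "X", "model_b": "Y", "winner": "model_b"},
]
print(win_rates(records, "side_by_side"))
```

Filtering by mode before counting keeps user-selected pairings separate from the randomly paired battles used for ranking.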