Article "The Child That Surpassed Both Parents Through MRI-Guided Evolutionary Merge" 2 days ago • 13
Article Introducing WM Bench: A Benchmark for Cognitive Intelligence in World Models 4 days ago • 13
Article Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do 23 days ago • 38
Article MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning 24 days ago • 15
Article Structural Problems in AI Benchmarking and the Case for a Unified Evaluation Framework 26 days ago • 12
FINAL Bench Collection World's First Functional Metacognition Benchmark. "Not how much AI knows, but whether it knows what it doesn't know, and can fix it." • 2 items • Updated Feb 21 • 4
Article Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism? Feb 24 • 17