Mitko Vasilev

mitkox

AI & ML interests

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

Recent Activity

posted an update 16 days ago

I run 20 AI coding agents locally on my desktop workstation at 400+ tokens/sec with MiniMax-M2. It's a Sonnet drop-in replacement in my Cursor, Claude Code, Droid, Kilo and Cline. It peaks at 11k tok/sec input and 433 tok/s output, and can generate 1B+ tok/m, all with a 196k context window. I've been running it with this config for 6 days now. Today's max performance was stable at 490.2 tokens/sec across 48 concurrent clients and MiniMax M2. Z8 Fury G5, Xeon 3455, 4xA6K. Aibrix 0.5.0, vLLM 0.11.2,
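As a rough sanity check on the "1B+" figure, here is a minimal sketch of the arithmetic. It assumes (this is not stated in the post) that "tok/m" means tokens per month, that a month is 30 days, and that the quoted 433 tok/s output rate is sustained around the clock. The function name `tokens_per_month` is hypothetical and exists only for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_month(tokens_per_second, days=30):
    """Tokens produced over `days` days at a constant sustained rate.

    Assumption: the rate never drops, which a real workload will not match.
    """
    return tokens_per_second * SECONDS_PER_DAY * days


# 433 tok/s is the output figure quoted in the post; 30 days is an assumption.
output_rate = 433
monthly = tokens_per_month(output_rate)
print(f"{monthly:,} tokens in 30 days")  # prints "1,122,336,000 tokens in 30 days"
```

Under those assumptions the output rate alone clears one billion tokens in a month, so the claim is at least arithmetically consistent with the quoted throughput.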

posted an update 30 days ago

I just threw Qwen3-0.6B in BF16 into an on-device AI drag race on AMD Strix Halo with vLLM: 564 tokens/sec on short 100-token sprints and 96 tokens/sec on 8K-token marathons.

TL;DR: You don't just run AI on AMD. You negotiate with it. The hardware absolutely delivers. Spoiler alert: there is exactly ONE configuration where vLLM + ROCm + Triton + PyTorch + drivers + Ubuntu kernel work at the same time. Finding it required the patience of a saint.

Consumer AMD for AI inference is the ultimate "budget warrior" play: insane performance-per-euro, but you need hardcore technical skills that would make a senior sysadmin nod in quiet respect.

posted an update about 1 month ago

I have just vibe coded a feature for ODA on-device AI with MiniMax M2, running locally on my Z8 Fury - and holy silicon, this thing SLAPS!

TL;DR, the nerd stuff:
Specialized in coding and agentic work
60 tokens/sec
Ryzen AI is getting some serious ROCm 7.0.2 brain implants
One extra script to rule them all and bind them to my GPU
Vibe coding feature implementation that actually worked on the first try. I know, I'm scared too

liked a model about 2 months ago

Kwaipilot/KAT-Dev-72B-Exp

Text Generation • 73B • Updated Oct 13 • 680 • 155

liked a model 4 months ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated Aug 26 • 7.78k • 1k

liked a model 5 months ago

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Nov 7 • 170k • 2.27k

liked a model 6 months ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29 • 421k • 2.39k

liked a model 7 months ago

fdtn-ai/Foundation-Sec-8B

Text Generation • 8B • Updated Aug 26 • 6.97k • 275

liked 5 models 8 months ago

tngtech/DeepSeek-R1T-Chimera

Text Generation • 685B • Updated Nov 4 • 690 • 265

NousResearch/Minos-v1

Text Classification • 0.4B • Updated Apr 28 • 1.59k • 166

facebook/blt

Updated Apr 30 • 26 • 73

facebook/blt-7b

Updated May 1 • 155 • 61

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation • 253B • Updated Oct 15 • 119k • 339

liked a dataset 8 months ago

nvidia/OpenCodeReasoning

Viewer • Updated May 4 • 753k • 3.2k • 514

liked a model 8 months ago

nomic-ai/colnomic-embed-multimodal-7b

Visual Document Retrieval • Updated Apr 15 • 16.6k • 93

liked a dataset 8 months ago

virtuoussy/Multi-subject-RLVR

Viewer • Updated Apr 16 • 579k • 222 • 66

liked 4 models 9 months ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • 11B • Updated Apr 30 • 139k • 1.83k

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27 • 142k • 3.08k

unsloth/QwQ-32B-GGUF

Text Generation • 33B • Updated Apr 27 • 3.47k • 86

Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11 • 56.4k • 2.87k

liked 2 datasets 10 months ago

PrimeIntellect/SYNTHETIC-1

Viewer • Updated Feb 21 • 1.99M • 850 • 60

open-r1/OpenR1-Math-Raw

Viewer • Updated Feb 24 • 516k • 510 • 76

liked a model 11 months ago

mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.0

Text Generation • 2B • Updated Jan 29 • 84 • 44