Jonna Matthiesen
Recent Activity
updated a Space 1 day ago
embedl/Edge-Inference-Benchmarks
liked a model 7 days ago
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead
posted an update 7 days ago
FlashHead benchmarks for Llama 3.2, Gemma 3, and Qwen3 are now on https://huggingface.co/spaces/embedl/Edge-Inference-Benchmarks !
These are some of the models used in the FlashHead paper, and they are now easier to explore and compare interactively.
Jetson AGX Thor (tok/s, batch=1):
- Llama-3.2-1B: 77 → 285 (FlashHead+W4A16, 3.7x)
- Llama-3.2-3B: 34 → 112 (3.3x)
- Gemma-3-1B: 79 → 153 (1.9x)
- Qwen3-1.7B: 49 → 189 (3.8x)
- Qwen3-0.6B: 140 → 177 (1.3x)
Accuracy matches baseline on MMLU-Pro, IFEval, BBH, TruthfulQA, GSM8K.
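
For anyone who wants to check the speedup factors, here is a minimal Python sketch that recomputes them from the rounded tok/s figures listed above. The dictionary layout and the function names `speedup` and `speedup_table` are made up for this illustration and are not part of the benchmark Space. Because the inputs are already rounded, a recomputed ratio can differ from the quoted one in the last digit.

```python
# Throughput in tok/s at batch=1 on Jetson AGX Thor, copied from the list above.
# The (baseline, optimized) tuple layout is an assumption made for this sketch.
THROUGHPUT = {
    "Llama-3.2-1B": (77, 285),
    "Llama-3.2-3B": (34, 112),
    "Gemma-3-1B": (79, 153),
    "Qwen3-1.7B": (49, 189),
    "Qwen3-0.6B": (140, 177),
}


def speedup(baseline, optimized):
    """Return the throughput ratio optimized / baseline."""
    return optimized / baseline


def speedup_table(throughput):
    """Map each model name to its speedup factor."""
    return {name: speedup(b, o) for name, (b, o) in throughput.items()}


if __name__ == "__main__":
    for name, ratio in speedup_table(THROUGHPUT).items():
        print(f"{name}: {ratio:.2f}x")
```

The ratio is simply the optimized throughput divided by the baseline throughput, so it only describes single-stream decoding speed at batch=1 and says nothing about accuracy.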