Embedl
Embedl develops advanced tools and algorithms for Edge AI. Our mission is to make AI models run faster, more energy-efficiently, and more reliably across diverse hardware platforms, while significantly reducing development time.
We help teams deploy high-performance AI on real-world, resource-constrained devices.
Embedl Models (Community)
Pre-optimized models that can be used off the shelf or customized for specific hardware targets supported by the embedl-models package.
First release highlights:
- The fastest Small Language Models (SLMs) using FlashHead, a novel architectural improvement to the language-model head
- Works with popular models like Llama, Gemma, and Qwen
- Provides speedups on top of:
  - Quantization
  - Flash Attention
  - Other standard optimizations
Device: Nvidia Jetson Thor
| Model | Generation speed (tokens/s) |
|---|---|
| embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16 | 100 |
| Llama-3.2-3B-Instruct-W4A16* | 80 |
| RedHatAI/Llama-3.2-3B-Instruct-FP8 | 64 |
| meta-llama/Llama-3.2-3B-Instruct | 37 |
*Embedl-quantized model used for benchmarking: similar to the FlashHead-W4A16 model, but without the faster FlashHead and the custom generation loop.
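To make the table easier to compare, the reported generation speeds can be turned into relative speedups. The snippet below is a minimal sketch that only does arithmetic on the numbers listed above; the helper name `relative_speedups` is hypothetical and not part of the embedl-models package.

```python
# Generation speeds (tokens/s) copied from the Jetson Thor table above.
speeds = {
    "embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16": 100,
    "Llama-3.2-3B-Instruct-W4A16": 80,
    "RedHatAI/Llama-3.2-3B-Instruct-FP8": 64,
    "meta-llama/Llama-3.2-3B-Instruct": 37,
}


def relative_speedups(speeds, reference):
    """Return each model's speed divided by the reference model's speed.

    Hypothetical helper for illustration only.
    """
    ref = speeds[reference]
    return {name: tps / ref for name, tps in speeds.items()}


# Compare every model against the unquantized base model.
speedups = relative_speedups(speeds, "meta-llama/Llama-3.2-3B-Instruct")
for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

With the unquantized model as the reference, the FlashHead-W4A16 variant comes out at roughly 2.7x. Using `Llama-3.2-3B-Instruct-W4A16` as the reference instead isolates the gain from FlashHead and the custom generation loop, which is 100 / 80 = 1.25x.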
Contact
Headquarters (Sweden)
Gamla Almedalsvägen 39
412 63 Gothenburg, Sweden
Email: [email protected]
nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl.
models 13
- embedl/Cosmos-Reason2-2B-NVFP4A16 (Image-Text-to-Text, 2B)
- embedl/Cosmos-Reason2-2B-W4A16 (Image-Text-to-Text, 2B)
- embedl/Cosmos-Reason2-2B-W4A16-Edge2 (Image-Text-to-Text, 2B)
- embedl/gemma-3-270m-it-FlashHead
- embedl/Qwen3-0.6B-FlashHead
- embedl/gemma-3-1b-it-FlashHead-W4A16 (0.4B)
- embedl/Llama-3.2-3B-Instruct-FlashHead-W4A16 (1B)
- embedl/Llama-3.2-1B-Instruct-FlashHead-W4A16 (0.7B)
- embedl/Llama-3.2-1B-Instruct-FlashHead (1B)
- embedl/Llama-3.2-3B-Instruct-FlashHead (3B)