Embedl

Team

company

https://www.embedl.com

embedl

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

quantshah updated a model 3 days ago

embedl/sam3

quantshah updated a dataset 4 days ago

embedl/documentation-images

quantshah published a model 4 days ago

embedl/sam3

embedl's collections 6

FlashHead

Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head

embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7
embedl/Qwen3-1.7B-FlashHead-W4A16

2B • Updated 6 days ago • 140 • 3
embedl/gemma-3-270m-it-FlashHead

0.3B • Updated 6 days ago • 212 • 3
embedl/Qwen3-0.6B-FlashHead

0.6B • Updated 6 days ago • 85 • 4

Cosmos-Reason2

nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl.

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 17 days ago • 19k • 12
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7
embedl/Cosmos-Reason2-2B-NVFP4A16

Image-Text-to-Text • 2B • Updated Mar 2 • 21 • 1
embedl/Cosmos-Reason2-2B-W4A16

Image-Text-to-Text • 2B • Updated Mar 17 • 658 • 7

NVIDIA Jetson AGX Orin

Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference.

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 17 days ago • 19k • 12
embedl/Cosmos-Reason2-2B-W4A16

Image-Text-to-Text • 2B • Updated Mar 17 • 658 • 7
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7
Edge Inference Benchmarks 🚀

Running • 6 • On-device benchmarks across devices and models.

EdgeN

Quantization strategy where most weights are converted to INT4, activations remain in FP16, and sensitive layers are preserved in FP16.

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 17 days ago • 19k • 12
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7
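
The EdgeN description above (INT4 weights, FP16 activations, sensitive layers kept in FP16) can be illustrated with a minimal NumPy sketch. This is not Embedl's implementation: the symmetric per-group scheme, the group size of 32, the layer names, and the helpers `quantize_int4`, `dequantize_int4`, and `quantize_model` are all assumptions made for illustration. How sensitive layers are actually selected is not stated on this page, so the sketch simply takes them as a given set.

```python
import numpy as np


def quantize_int4(w, group_size=32):
    """Symmetric per-group INT4 quantization of a 2-D weight matrix.

    Assumed scheme for illustration only. Values land in [-8, 7] and are
    stored in int8 for clarity; real kernels pack two 4-bit values per byte.
    """
    rows, cols = w.shape
    assert cols % group_size == 0, "columns must be divisible by group_size"
    groups = w.astype(np.float32).reshape(rows, cols // group_size, group_size)
    max_abs = np.abs(groups).max(axis=-1, keepdims=True)
    # One scale per group; fall back to 1.0 for all-zero groups.
    scales = np.where(max_abs > 0, max_abs / 7.0, 1.0).astype(np.float32)
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales


def dequantize_int4(q, scales):
    """Reconstruct FP16 weights so they can be multiplied with FP16 activations."""
    rows = q.shape[0]
    return (q.astype(np.float32) * scales).reshape(rows, -1).astype(np.float16)


def quantize_model(layers, sensitive):
    """Quantize every layer to INT4 except those named in `sensitive` (kept FP16)."""
    packed = {}
    for name, w in layers.items():
        if name in sensitive:
            packed[name] = ("fp16", w.astype(np.float16))
        else:
            packed[name] = ("int4", quantize_int4(w))
    return packed


# Hypothetical layer names and shapes, used only to exercise the sketch.
rng = np.random.default_rng(0)
layers = {
    "mlp.proj": rng.standard_normal((4, 64)).astype(np.float32),
    "lm_head": rng.standard_normal((4, 64)).astype(np.float32),
}
packed = quantize_model(layers, sensitive={"lm_head"})
```

In this sketch the rounding error of each INT4 weight is bounded by half of its group's scale, which is why smaller groups or FP16 fallbacks for outlier-heavy layers tend to reduce accuracy loss at the cost of memory.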

NVIDIA Jetson Orin Nano

Ultra-efficient model variants optimized for Jetson Orin Nano. Designed for constrained edge environments requiring low memory footprint.

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 17 days ago • 19k • 12
embedl/Cosmos-Reason2-2B-W4A16

Image-Text-to-Text • 2B • Updated Mar 17 • 658 • 7
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7
Edge Inference Benchmarks 🚀

Running • 6 • On-device benchmarks across devices and models.

NVIDIA Jetson AGX Thor

Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads.

embedl/Cosmos-Reason2-2B-NVFP4A16

Image-Text-to-Text • 2B • Updated Mar 2 • 21 • 1
embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 17 days ago • 19k • 12
embedl/Cosmos-Reason2-2B-W4A16

Image-Text-to-Text • 2B • Updated Mar 17 • 658 • 7
embedl/Cosmos-Reason2-2B-W4A16-Edge2-FlashHead

Image-Text-to-Text • 2B • Updated 6 days ago • 1.09k • 7

Team members 6
