nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl.
Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference.
Efficient Drop-In Replacement for the Classification Head in Language Model Inference.
Ultra-efficient model variants optimized for Jetson Orin Nano. Designed for constrained edge environments requiring low memory footprint.
Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads.
Quantization strategy where most weights are converted to INT4, activations remain in FP16, and sensitive layers are preserved in FP16.
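The mixed-precision scheme described above can be illustrated with a minimal sketch. It is not Embedl's implementation: the symmetric per-group rounding, the group size, and the helper names (`quantize_int4`, `dequantize_int4`, `quantize_model`, `int4_dot`) are assumptions made for illustration, and plain Python floats stand in for FP16 values.

```python
from typing import Dict, List, Set, Tuple

INT4_MIN, INT4_MAX = -8, 7  # range of a signed 4-bit integer


def quantize_int4(weights: List[float], group_size: int = 4) -> Tuple[List[int], List[float]]:
    """Symmetric per-group quantization: one float scale per group of weights."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to INT4_MAX; avoid a zero scale.
        scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            codes.append(max(INT4_MIN, min(INT4_MAX, q)))
    return codes, scales


def dequantize_int4(codes: List[int], scales: List[float], group_size: int = 4) -> List[float]:
    """Recover approximate float weights from INT4 codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


def quantize_model(layers: Dict[str, List[float]], sensitive: Set[str], group_size: int = 4):
    """Quantize every layer to INT4 except those named in `sensitive` (hypothetical names)."""
    out = {}
    for name, w in layers.items():
        if name in sensitive:
            out[name] = ("fp16", list(w))  # kept in floating point, untouched
        else:
            out[name] = ("int4", quantize_int4(w, group_size))
    return out


def int4_dot(activations: List[float], codes: List[int], scales: List[float], group_size: int = 4) -> float:
    """Activations stay in floating point; weights are dequantized before the multiply."""
    weights = dequantize_int4(codes, scales, group_size)
    return sum(a * w for a, w in zip(activations, weights))
```

For example, calling `quantize_model({"mlp": [0.7, -0.3, 0.1, 0.0], "lm_head": [0.5, 0.25]}, sensitive={"lm_head"})` stores `mlp` as INT4 codes plus one scale per group and leaves `lm_head` in floating point. Which layers count as sensitive is model-specific and is not stated in the descriptions above.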