Collection of Quantized Models for MoE
Krishna Teja Chitty-Venkata
AI & ML interests
LLM Optimization, Neural Architecture Search, Quantization, Pruning
Recent Activity
updated a model about 2 hours ago: inference-optimization/Llama-3.1-8B-Instruct-6-bits
published a model about 2 hours ago: inference-optimization/Llama-3.1-8B-Instruct-6-bits
updated a model 1 day ago: RedHatAI/sarvam-105b-FP8-Dynamic