Collection of Quantized Models for MoE
Krishna Teja Chitty-Venkata
krishnateja95
AI & ML interests
LLM Optimization, Neural Architecture Search, Quantization, Pruning
Recent Activity
Updated a model 2 days ago: inference-optimization/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
Updated a model 3 days ago: inference-optimization/Llama-3.1-8B-Instruct-Mixed-NVFP4-FP8_DYNAMIC-gate_up_proj-all
Updated a model 3 days ago: inference-optimization/Llama-3.1-8B-Instruct-Mixed-NVFP4-FP8_DYNAMIC-down_proj-all