nm-testing/Meta-Llama-3-8B-Instruct-MXFP4A16-GPTQ
nm-testing/Speculator-Qwen3-30B-MOE-VL-Eagle3
0.4B • Updated • 323
nm-testing/Qwen3-0.6B-FP8_BLOCK
0.6B • Updated • 3
nm-testing/Qwen3-0.6B-W4A16-G128
0.6B • Updated • 2
nm-testing/Llama-3.2-1B-Instruct-DEBUG-STRAWBERRY
nm-testing/Llama-3.2-1B-Instruct-DEBUG-COUNTER
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme
Text Generation • 1B • Updated • 293
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-attn_head
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-tensor
1B • Updated • 7.21k
nm-testing/Meta-Llama-3-8B-Instruct-awq-NVFP4
nm-testing/testing-llama3.1.8b-2layer-eagle3
nm-testing/CDH-test-nvfp4-awq
5B • Updated • 1
nm-testing/granite-4.0-h-small-FP8-dynamic
Text Generation • 32B • Updated • 2
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable
2.93M • Updated • 17.5k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Tensor
Updated
nm-testing/Llama-3.1-8B-Instruct-QKV-Cache-FP8-Per-Head
Updated
nm-testing/DeepSeek-R1-Distill-Qwen-32B-NVFP4
Text Generation • 19B • Updated • 860 • 3
nm-testing/tinysmokeqwen3moe-W4A16-first-only
2.93M • Updated • 5
nm-testing/tinysmokeqwen3moe
2.93M • Updated • 1.97k
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4
5B • Updated • 2