Model Card for genevera/Qwen3-VL-32B-Instruct-Heretic-FP8-DYNAMIC

This is an FP8 (dynamic) quantization of coder3101/Qwen3-VL-32B-Instruct-Heretic.

vllm (pretrained=genevera/Qwen3-VL-32B-Instruct-Heretic-FP8-DYNAMIC,add_bos_token=True,max_model_len=161184,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto
|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.924|±  |0.0168|
|     |       |strict-match    |     5|exact_match|↑  |0.928|±  |0.0164|
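
The Stderr column can be sanity-checked from the run header, which limits the evaluation to 250 samples. The following is a minimal sketch, assuming each sample is scored 0 or 1 and that the standard error is the sample standard deviation (with `n - 1` in the denominator) divided by `sqrt(n)`. The helper name `binomial_stderr` is hypothetical and is not part of any evaluation library.

```python
import math


def binomial_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores.

    Assumption: Bessel-corrected sample variance, so the result
    simplifies to sqrt(p * (1 - p) / (n - 1)).
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


N_SAMPLES = 250  # taken from "limit: 250.0" in the run header

for name, acc in [("flexible-extract", 0.924), ("strict-match", 0.928)]:
    se = binomial_stderr(acc, N_SAMPLES)
    print(f"{name}: {acc:.3f} ± {se:.4f}")
```

Under these assumptions the computed values round to 0.0168 and 0.0164, matching the table. With only 250 samples the uncertainty is about ±1.7 percentage points, so the gap between the two filters is well within noise.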

Model size: 33B params. Tensor types (Safetensors): BF16, F8_E4M3.