Alex Steiner
einsteiner1983
AI & ML interests
Data Science
Organizations
KeyError: '110.w1.input_scale' with TRT
2
#3 opened 4 months ago
by
guanwenyu1995
https://huggingface.co/nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
2
#7 opened 10 months ago
by
einsteiner1983
Where is the config.json?
4 reactions
1
#17 opened 10 months ago
by
einsteiner1983
Run 1T-param on A100/H100(80G)x8 using FP4
5 reactions
7
#9 opened 11 months ago
by
ghostplant
How to combine `thinking on/off` prompt with existing system prompt.
2
#8 opened about 1 year ago
by
michaelfeil
When input tokens < 4096 but total input + output tokens > 4096, the model produces poor output
2 reactions
7
#85 opened almost 2 years ago
by
einsteiner1983