Spaces: optiviseapp/fnmodel (Paused)

47.5 kB

2 contributors

History: 36 commits

Latest commit (2900b36, 28 days ago) by aeb56: Monkey-patch transformers to disable flash attention via wrapper script

File                 Size       Updated            Last commit
.gitattributes       1.52 kB    about 1 month ago  initial commit
.gitignore           543 Bytes  about 1 month ago  Initial commit: LoRA model merger
Dockerfile           1.1 kB     about 1 month ago  Switch to vLLM for high-performance, stable inference
README.md            4.47 kB    28 days ago        Aggressive memory cleanup: 5s wait, env vars, optional model loading
README_inference.md  2.66 kB    about 1 month ago  Transform Space into professional inference UI for fine-tuned model
app.py               20.2 kB    28 days ago        Monkey-patch transformers to disable flash attention via wrapper script
inference_app.py     11.9 kB    about 1 month ago  Transform Space into professional inference UI for fine-tuned model
merge_script.py      4.8 kB     about 1 month ago  Implement manual LoRA merging to fix PEFT key naming conflicts
requirements.txt     356 Bytes  28 days ago        Workaround flash-attn: create fake module with PyTorch fallback attention