Models and datasets for food-focused vision-language tasks such as food detection, item extraction, and structured visual understanding.
- berkeruveyik/FoodExtraqt-Vision-SmoLVLM2-500M-fine-tune-v3
  Image-to-Text • 0.5B • Updated • 38
- berkeruveyik/smolvlm2-256m-FoodExtract-Vision-v2-without-peft-stage-2
  Image-to-Text • 0.3B • Updated • 50
- berkeruveyik/smolvlm2-256m-FoodExtract-Vision-v2-without-peft
  Image-to-Text • 0.3B • Updated • 32
- berkeruveyik/smolvlm2-256m-FoodExtract-Vision-v1
  Image-to-Text • Updated