Image Labels: One-shot image-conditioned object detection
#5
by godaspeg - opened
Is it possible to detect objects using images as labels instead of text? Since OWL-ViT is based on CLIP embeddings, I think this should be theoretically possible.
godaspeg changed discussion title from Image Labels to Image Labels: One-shot image-conditioned object detection
Yes, image-guided object detection is supported; see the demo notebook: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/OWLv2/Zero_and_one_shot_object_detection_with_OWLv2.ipynb
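For a quick starting point without opening the notebook, here is a minimal sketch of image-guided detection with Transformers. It is not taken from the notebook. It assumes `torch`, `transformers`, and Pillow are installed, and that both arguments are PIL images. The function names `detect_with_query_image` and `summarize` are hypothetical helpers, and the `threshold` and `nms_threshold` defaults are only starting values that will likely need tuning for your data.

```python
def detect_with_query_image(target_image, query_image,
                            model_id="google/owlv2-base-patch16",
                            threshold=0.9, nms_threshold=0.3):
    """Return (boxes, scores) for regions of target_image that resemble query_image.

    Both images are expected to be PIL images. Hypothetical helper, not part
    of the Transformers library.
    """
    # Heavy dependencies are imported lazily so this file loads without them.
    import torch
    from transformers import AutoProcessor, Owlv2ForObjectDetection

    processor = AutoProcessor.from_pretrained(model_id)
    model = Owlv2ForObjectDetection.from_pretrained(model_id)

    # Passing query_images instead of text makes the query an image.
    inputs = processor(images=target_image, query_images=query_image,
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model.image_guided_detection(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([target_image.size[::-1]])
    results = processor.post_process_image_guided_detection(
        outputs=outputs,
        threshold=threshold,
        nms_threshold=nms_threshold,
        target_sizes=target_sizes,
    )
    # One result dict per target image; only one image was passed in.
    return results[0]["boxes"].tolist(), results[0]["scores"].tolist()


def summarize(boxes, scores):
    """Pair each box with its score, highest score first (pure Python)."""
    pairs = sorted(zip(scores, boxes), key=lambda pair: pair[0], reverse=True)
    return [
        {"score": round(score, 3), "box": [round(v, 1) for v in box]}
        for score, box in pairs
    ]
```

For example, with two hypothetical local files, `summarize(*detect_with_query_image(Image.open("scene.jpg"), Image.open("query.jpg")))` would list the matched boxes as `[xmin, ymin, xmax, ymax]`, best match first. If nothing is returned, lowering `threshold` is usually the first thing to try.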
godaspeg changed discussion status to closed