jinaai/jina-embeddings-v4

@@ -26,13 +26,15 @@ Embeddings produced by `jina-embeddings-v4` serve as the backbone for neural inf
 Built based on [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct), `jina-embeddings-v4` has the following features:
-- **Unified embeddings** for text, images, and documents, supporting both dense (single-vector) and late-interaction (multi-vector) retrieval.
 - **Multilingual support** (20+ languages) and compatibility with a wide range of domains, including technical and visually complex documents.
 - **Task-specific adapters** for retrieval, text matching, and code-related tasks, which can be selected at inference time.
 - **Flexible embedding size**: dense embeddings are 2048 dimensions by default but can be truncated to as low as 128 with minimal performance loss.
 Summary of features:
 | Feature   | Jina Embeddings V4   |
 |------------|------------|
 | Base Model | Qwen2.5-VL-3B-Instruct |
@@ -42,8 +44,8 @@ Summary of features:
 | Single-Vector Dimension | 2048 |
 | Multi-Vector Dimension | 128 |
 | Matryoshka dimensions | 128, 256, 512, 1024, 2048 |
-| Attention Mechanism | FlashAttention2 |
 | Pooling Strategy | Mean pooling |
@@ -58,6 +60,7 @@ Please refer to our [technical report of jina-embeddings-v4](https://puginarug.c
   <summary>Requirements</summary>
 The following Python packages are required:
 - `transformers>=4.52.0`
 - `torch>=2.6.0`
 - `peft>=0.15.2`
@@ -68,25 +71,21 @@ The following Python packages are required:
 - **flash-attention**: Installing [flash-attention](https://github.com/Dao-AILab/flash-attention) is recommended for improved inference speed and efficiency, but not mandatory.
 - **sentence-transformers**: If you want to use the model via the `sentence-transformers` interface, install this package as well.
 </details>
 <details>
-  <summary>via Jina AI <a href="https://jina.ai/embeddings/">Embedding API</a></summary>
-Needs to be adjusted for V4
 ```bash
 curl https://api.jina.ai/v1/embeddings \
   -H "Content-Type: application/json" \
-  -H "Authorization: Bearer [JINA_AI_API_TOKEN]" \
   -d @- <<EOFEOF
   {
     "model": "jina-embeddings-v4",
-    "dimensions": 1024,
-    "task": "retrieval.query",
-    "normalized": true,
-    "embedding_type": "float",
     "input": [
         {
             "text": "غروب جميل على الشاطئ"
@@ -136,37 +135,41 @@ EOFEOF
 ```python
 # !pip install transformers>=4.52.0 torch>=2.6.0 peft>=0.15.2 torchvision pillow
-# !pip install
 from transformers import AutoModel
 # Initialize the model
 model = AutoModel.from_pretrained("jinaai/jina-embeddings-v4", trust_remote_code=True)
 # ========================
 # 1. Retrieval Task
 # ========================
 # Configure truncate_dim, max_length (for texts), max_pixels (for images), vector_type, batch_size in the encode function if needed
 # Encode query
-query_embedding = model.encode_texts(
     texts=["Overview of climate change impacts on coastal cities"],
     task="retrieval",
     prompt_name="query",
-)[0]
 # Encode passage (text)
-passage_embedding = model.encode_texts(
     texts=[
         "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
     ],
     task="retrieval",
     prompt_name="passage",
-)[0]
 # Encode image/document
-image_embedding = model.encode_images(
     images=["https://i.ibb.co/nQNGqL0/beach1.jpg"],
     task="retrieval",
-)[0]
 # ========================
 # 2. Text Matching Task
@@ -183,25 +186,43 @@ texts = [
     "해변 위로 아름다운 일몰",  # Korean
 ]
-text_embeddings = model.encode_texts(texts=texts, task="text-matching")
 # ========================
 # 3. Code Understanding Task
 # ========================
 # Encode query
-query_embedding = model.encode_texts(
     texts=["Find a function that prints a greeting message to the console"],
     task="code",
     prompt_name="query",
 )
 # Encode code
-code_embeddings = model.encode_texts(
     texts=["def hello_world():\n    print('Hello, World!')"],
     task="code",
     prompt_name="passage",
 )
 ```
 </details>

 Built based on [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct), `jina-embeddings-v4` has the following features:
+- **Unified embeddings** for text, images, and visual documents, supporting both dense (single-vector) and late-interaction (multi-vector) retrieval.
 - **Multilingual support** (20+ languages) and compatibility with a wide range of domains, including technical and visually complex documents.
 - **Task-specific adapters** for retrieval, text matching, and code-related tasks, which can be selected at inference time.
 - **Flexible embedding size**: dense embeddings are 2048 dimensions by default but can be truncated to as low as 128 with minimal performance loss.
 Summary of features:
 | Feature   | Jina Embeddings V4   |
 |------------|------------|
 | Base Model | Qwen2.5-VL-3B-Instruct |
 | Single-Vector Dimension | 2048 |
 | Multi-Vector Dimension | 128 |
 | Matryoshka dimensions | 128, 256, 512, 1024, 2048 |
 | Pooling Strategy | Mean pooling |
+| Attention Mechanism | FlashAttention2 |
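
The Matryoshka dimensions listed in the table can be illustrated with a small sketch. It assumes the usual Matryoshka convention of keeping the leading components of a dense embedding and re-normalizing them; the README does not state whether the model re-normalizes after truncation, so treat that step as an assumption. `truncate_embedding` is a hypothetical helper, not part of the model's API, and the 8-dimensional toy vector stands in for a real 2048-dimensional embedding.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and L2-normalize the result.

    Hypothetical helper for illustration only; not part of the model's API.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = [float(x) for x in vec[:dim]]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy 8-dimensional vector standing in for a 2048-dimensional embedding.
full = [0.5, -0.5, 0.5, -0.5, 0.1, 0.1, 0.1, 0.1]
short = truncate_embedding(full, 4)
print(short)  # [0.5, -0.5, 0.5, -0.5]
```

In practice the `truncate_dim` option mentioned in the usage example is likely the more convenient route; the sketch only shows what truncation means for a stored vector.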
   <summary>Requirements</summary>
 The following Python packages are required:
 - `transformers>=4.52.0`
 - `torch>=2.6.0`
 - `peft>=0.15.2`
 - **flash-attention**: Installing [flash-attention](https://github.com/Dao-AILab/flash-attention) is recommended for improved inference speed and efficiency, but not mandatory.
 - **sentence-transformers**: If you want to use the model via the `sentence-transformers` interface, install this package as well.
 </details>
 <details>
+  <summary>via <a href="https://jina.ai/embeddings/">Jina AI Embeddings API</a></summary>
 ```bash
 curl https://api.jina.ai/v1/embeddings \
   -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $JINA_AI_API_TOKEN" \
   -d @- <<EOFEOF
   {
     "model": "jina-embeddings-v4",
+    "task": "text-matching",
     "input": [
         {
             "text": "غروب جميل على الشاطئ"
 ```python
 # !pip install transformers>=4.52.0 torch>=2.6.0 peft>=0.15.2 torchvision pillow
+# !pip install
 from transformers import AutoModel
+import torch
 # Initialize the model
 model = AutoModel.from_pretrained("jinaai/jina-embeddings-v4", trust_remote_code=True)
+model.to("cuda")
 # ========================
 # 1. Retrieval Task
 # ========================
 # Configure truncate_dim, max_length (for texts), max_pixels (for images), vector_type, batch_size in the encode function if needed
 # Encode query
+query_embeddings = model.encode_text(
     texts=["Overview of climate change impacts on coastal cities"],
     task="retrieval",
     prompt_name="query",
+)
 # Encode passage (text)
+passage_embeddings = model.encode_text(
     texts=[
         "Climate change has led to rising sea levels, increased frequency of extreme weather events..."
     ],
     task="retrieval",
     prompt_name="passage",
+)
 # Encode image/document
+image_embeddings = model.encode_image(
     images=["https://i.ibb.co/nQNGqL0/beach1.jpg"],
     task="retrieval",
+)
 # ========================
 # 2. Text Matching Task
     "해변 위로 아름다운 일몰",  # Korean
 ]
+text_embeddings = model.encode_text(texts=texts, task="text-matching")
 # ========================
 # 3. Code Understanding Task
 # ========================
 # Encode query
+query_embedding = model.encode_text(
     texts=["Find a function that prints a greeting message to the console"],
     task="code",
     prompt_name="query",
 )
 # Encode code
+code_embeddings = model.encode_text(
     texts=["def hello_world():\n    print('Hello, World!')"],
     task="code",
     prompt_name="passage",
 )
+# ========================
+# 4. Use multivectors
+# ========================
+multivector_embeddings = model.encode_text(
+    texts=texts,
+    task="retrieval",
+    prompt_name="query",
+    return_multivector=True,
+)
+images = ["https://i.ibb.co/nQNGqL0/beach1.jpg", "https://i.ibb.co/r5w8hG8/beach2.jpg"]
+multivector_image_embeddings = model.encode_image(
+    images=images,
+    task="retrieval",
+    return_multivector=True,
+)
 ```
 </details>
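
The multivector outputs from section 4 are meant for late-interaction retrieval. The README does not say how to score them, so the following is a minimal sketch that assumes the MaxSim scoring commonly used for late interaction (as in ColBERT): for each query vector, take the maximum dot product against all document vectors, then sum those maxima. `maxsim_score` is a hypothetical helper, and plain Python lists stand in for the tensors the model would return.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vectors, doc_vectors):
    """Sum, over query vectors, of the best dot product with any document vector.

    Hypothetical helper for illustration only; not part of the model's API.
    """
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Toy 2-dimensional token vectors standing in for 128-dimensional ones.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

print(maxsim_score(query, doc_a))  # 2.0
print(maxsim_score(query, doc_b))  # 1.0
```

Here `doc_a` ranks above `doc_b` because each query vector finds an exact match among its token vectors, whereas `doc_b` only offers a partial match for both.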