# SentenceTransformer based on google/embeddinggemma-300m
This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details

### Model Description
- Model Type: Sentence Transformer
- Base model: google/embeddinggemma-300m
- Maximum Sequence Length: 2048 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
### Model Sources

- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
```
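The two Dense layers project the mean-pooled 768-dimensional representation up to 3072 and back down to 768, and the final Normalize() module makes every embedding unit-length, so cosine similarity reduces to a dot product. A quick sanity check of shape and norm (the input sentence is only illustrative):

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("yasserrmd/oncology-gemma-300m-emb")

# Encode a single sentence and inspect the embedding.
embedding = model.encode(["Temozolomide resistance contributes to GBM recurrence."])
print(embedding.shape)                    # (1, 768) -- matches the output dimensionality
print(np.linalg.norm(embedding, axis=1))  # ~[1.] -- embeddings are L2-normalized
```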
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yasserrmd/oncology-gemma-300m-emb")

# Run inference
queries = [
    "What are the current standard treatments for glioblastoma multiforme (GBM) and why is recurrence almost unavoidable?\n",
]
documents = [
    'The current standard treatment for GBM includes surgery, radiotherapy, and chemotherapy. However, complete surgical resection is not possible, and GBM is resistant to chemotherapy, including the commonly used drug temozolomide (TMZ). This resistance and the inability to completely remove the tumor during surgery contribute to the high recurrence rate of GBM.',
    'The overexpression of GALNT2 in oral squamous cell carcinoma (OSCC) cells can promote their invasive potential. GALNT2 modifies the O-glycosylation of proteins and increases the activity of epidermal growth factor receptor (EGFR), which plays a crucial role in the invasive behavior of OSCC cells. This suggests that GALNT2 may be involved in the occurrence and development of OSCC.',
    'The main mechanisms responsible for oncogene-mediated drug resistance in ovarian cancer include deregulation of apoptosis, altered phosphorylation (intracellular signaling), and metabolic pathways. Activation of the PI3K/AKT cell survival pathway, as well as deregulation of growth factor receptors mediated by NF-kB and STAT3, plays a pivotal role in drug resistance. Additionally, alterations in DNA damage and repair mechanisms, impaired apoptotic machinery, and epithelial-to-mesenchymal transition (EMT) have been implicated in drug resistance. Wnt signaling, particularly the β-catenin-independent pathway via Wnt5a/ROR1/ROR2, is also involved in EMT and chemoresistance. Targeting these pathways may offer potential means to overcome drug resistance in ovarian cancer.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.7010,  0.0508, -0.0444]])
```
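Beyond printing the raw similarity matrix, the same tensor can rank the candidate documents per query. A small follow-on sketch, reusing the `queries`, `documents`, and `similarities` variables from the snippet above:

```python
import torch

# Rank the candidate documents for each query by descending cosine similarity.
for q_idx, query in enumerate(queries):
    order = torch.argsort(similarities[q_idx], descending=True)
    print(query.strip())
    for rank, d_idx in enumerate(order.tolist(), start=1):
        score = similarities[q_idx, d_idx].item()
        print(f"  {rank}. score={score:.4f}  {documents[d_idx][:70]}...")
```

For the example query above, this prints the GBM treatment passage first (score 0.7010), well ahead of the two unrelated oncology passages.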
## Training Details

### Training Dataset

#### Unnamed Dataset
- Size: 20,000 training samples
- Columns: `sentence_0` and `sentence_1`
- Approximate statistics based on the first 1000 samples:

  |         | sentence_0                                          | sentence_1                                           |
  |:--------|:----------------------------------------------------|:-----------------------------------------------------|
  | type    | string                                              | string                                               |
  | details | min: 10 tokens, mean: 22.55 tokens, max: 51 tokens  | min: 18 tokens, mean: 91.28 tokens, max: 219 tokens  |

- Samples:

  | sentence_0 | sentence_1 |
  |:-----------|:-----------|
  | Is there a way to prevent PTLD in high-risk patients? | Currently, there is no convincing data for the prophylaxis of PTLD. However, the case mentioned suggests that early use of rituximab after HSCT (Hematopoietic Stem Cell Transplantation) could be a good way to prevent PTLD in high-risk patients, especially those who are serum EBV (Epstein-Barr Virus) positive. Early recognition of PTLD, early lymph node biopsy, and early diagnosis are key factors in the successful treatment of PTLD. |
  | How does the 34-gene 'CTC profile' contribute to the prognostic power of breast cancer patients? | The 34-gene 'CTC profile' has been found to be predictive of CTC status in breast cancer patients. It demonstrated a classification accuracy of 82% in the training cohort and 67% in an independent microarray dataset. Furthermore, it has been shown to be prognostic in both independent datasets, with a hazard ratio (HR) of 10 in the first validation dataset and a HR of 3.2 in the second validation dataset. Importantly, multivariate analysis confirmed that the CTC profile provided prognostic information independent of other clinical variables in both patient cohorts. |
  | How are beauty care services for cancer patients organized and provided? | Beauty care services for cancer patients are not standardized or evaluated and vary from one establishment to another. In the case of the IGR, consultations on image advice and socio-aesthetics are provided by a socio-aesthetician who has been trained as a personal image advisor. These consultations are offered to women with breast cancer or young adults and adolescents with cancer who are referred by medical units. The consultations take place in a dedicated area with three rooms: an office, make-up parlor, and beauty care salon. Patients are usually seen multiple times during their treatment period. The socio-aesthetician is paid by the hospital and is part of the Onco-hematology Interdisciplinary Supportive Care Directorate. |

- Loss: `MultipleNegativesRankingLoss` with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
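The samples above are positive (question, answer) pairs with no explicit negatives; `MultipleNegativesRankingLoss` treats every other answer in a batch as an in-batch negative for a given question. A minimal sketch of how such a dataset and loss could be wired together; the two rows are shortened placeholders taken from the samples above, not the actual training data:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("google/embeddinggemma-300m")

# Positive (question, answer) pairs only; within a batch, every other
# answer serves as an in-batch negative for a given question.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "Is there a way to prevent PTLD in high-risk patients?",
        "How are beauty care services for cancer patients organized and provided?",
    ],
    "sentence_1": [
        "Currently, there is no convincing data for the prophylaxis of PTLD. ...",
        "Beauty care services for cancer patients are not standardized or evaluated ...",
    ],
})

# scale=20.0 with cosine similarity matches the loss parameters listed above.
loss = MultipleNegativesRankingLoss(model, scale=20.0)
```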
### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `num_train_epochs`: 1
- `multi_dataset_batch_sampler`: round_robin
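These non-default values map directly onto `SentenceTransformerTrainingArguments`. A sketch of the corresponding trainer setup, assuming the `model`, `train_dataset`, and `loss` objects from the sketch in the previous section; the `output_dir` is a placeholder:

```python
from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

# Only the non-default hyperparameters listed above are set explicitly.
args = SentenceTransformerTrainingArguments(
    output_dir="oncology-gemma-300m-emb",  # placeholder output path
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```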
#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
### Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.1 | 500 | 0.0144 |
| 0.2 | 1000 | 0.0293 |
| 0.3 | 1500 | 0.0128 |
| 0.4 | 2000 | 0.0153 |
| 0.5 | 2500 | 0.0182 |
| 0.6 | 3000 | 0.008 |
| 0.7 | 3500 | 0.0098 |
| 0.8 | 4000 | 0.0044 |
| 0.9 | 4500 | 0.0024 |
| 1.0 | 5000 | 0.0019 |
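With 20,000 training pairs and a per-device batch size of 4, one epoch corresponds to 5,000 optimizer steps, which matches the final logged step above.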
### Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.1.0
- Transformers: 4.56.1
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title = {Efficient Natural Language Response Suggestion for Smart Reply},
    author = {Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year = {2017},
    eprint = {1705.00652},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
```