# SentenceTransformer based on google/embeddinggemma-300m
This is a sentence-transformers model finetuned from google/embeddinggemma-300m. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- Model Type: Sentence Transformer
- Base model: google/embeddinggemma-300m
- Maximum Sequence Length: 2048 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
### Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
```
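Mean pooling over token embeddings is followed by a 768 → 3072 → 768 Dense bottleneck, and the final `Normalize()` module returns unit-length vectors, so cosine similarity and dot product give the same ranking. A minimal sketch to verify the output shape and norm (the input sentence is just an illustration):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("yasserrmd/pharma-gemma-300m-emb")

# Encode one sentence and confirm the advertised output dimensionality.
embeddings = model.encode(["Aspirin irreversibly inhibits platelet cyclooxygenase."])
print(embeddings.shape)                    # (1, 768)

# The final Normalize() module makes each embedding unit-length.
print(np.linalg.norm(embeddings, axis=1))  # approximately [1.]
```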
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference:
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yasserrmd/pharma-gemma-300m-emb")

# Run inference
queries = [
    "What is the purpose of dabigatran and why was it prescribed to the 71-year-old female?",
]
documents = [
    'Dabigatran is prescribed for stroke prevention in patients with atrial fibrillation. Atrial fibrillation increases the risk of blood clots forming in the heart, which can then travel to the brain and cause a stroke. Dabigatran is an anticoagulant that helps prevent the formation of blood clots, reducing the risk of stroke in patients with atrial fibrillation.',
    'G-protein coupled receptors, like OXTR and MOR, can form homo- or hetero-dimers, which means they can associate with another molecule of the same receptor or with receptors from other families. This physical association has been shown to modulate receptor binding and function. For example, in MOR-alpha2A-adrenergic receptor dimers, the activation of MOR by morphine inhibits the adjacent alpha2A-receptor by blocking its ability to activate the G-proteins, even in the presence of noradrenaline.',
    'Melatonin agonists may have side effects such as nausea, headache, elevated liver enzyme levels, rebound insomnia, withdrawal symptoms, and addiction. Contraindications include liver failure, renal failure, alcohol addiction, and high lipid levels.',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.4430, 0.0378, -0.0539]])
```
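For retrieval over a larger corpus, the same embeddings can be passed to the library's `semantic_search` utility. A minimal sketch, reusing `query_embeddings` and `document_embeddings` from the snippet above:

```python
from sentence_transformers import util

# Rank all documents against each query by cosine similarity; top_k caps the hits.
hits = util.semantic_search(query_embeddings, document_embeddings, top_k=2)
for hit in hits[0]:
    print(f"doc {hit['corpus_id']}: {hit['score']:.4f}")
# doc 0 ranks first: the dabigatran passage answers the dabigatran question.
```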
## Training Details
### Training Dataset
#### Unnamed Dataset
- Size: 20,000 training samples
- Columns: `sentence_0` and `sentence_1`
- Approximate statistics based on the first 1000 samples:

  |             | sentence_0 | sentence_1 |
  |-------------|------------|------------|
  | type        | string     | string     |
  | min tokens  | 10         | 20         |
  | mean tokens | 21.09      | 94.97      |
  | max tokens  | 48         | 223        |

- Samples:

  | sentence_0 | sentence_1 |
  |---|---|
  | How does ticlopidine differ from clopidogrel in terms of side effects and precautions? | Unlike clopidogrel, ticlopidine can lead to neutropenia in up to 1% of patients, which limits its widespread use. Regular blood count checks are necessary in the initial weeks of ticlopidine treatment. Additionally, neuraxial regional anesthesia should not be performed until 10 days have elapsed since the last ingestion of ticlopidine. |
  | What are the different types of ligands that can bind to GPCRs? | GPCRs can bind a wide variety of endogenous ligands, including neuropeptides, amino acids, ions, hormones, chemokines, and lipid-derived mediators. Some GPCRs are considered orphan receptors because their exact ligands have not been identified yet. |
  | How does etomidate function as an adrenostatic agent and what are its effects on cortisol secretion? | Etomidate acts as an adrenostatic agent by blocking the cytochrome P450-dependent adrenal enzymes 11β-hydroxylase and cholesterol-side-chain cleavage enzyme. This inhibition leads to a decrease in cortisol secretion. In dispersed guinea-pig adrenal cells, etomidate has been shown to be the most potent adrenostatic drug available, with a mean concentration of 97 nmol/l required for 50% inhibition of cortisol secretion. This concentration is considerably lower than the plasma concentration needed to induce sedation. After a single induction dose of etomidate, the adrenocortical blockade lasts several hours while the hypnotic action of etomidate rapidly fades. |

- Loss: `MultipleNegativesRankingLoss` with these parameters:

  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `num_train_epochs`: 1
- `multi_dataset_batch_sampler`: round_robin
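Taken together with the loss configuration above, the run can be approximated with the `SentenceTransformerTrainer` API. The sketch below is a hypothetical reconstruction, not the exact training script; the two-row dataset and the `output_dir` are placeholders:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("google/embeddinggemma-300m")

# Placeholder rows; the real dataset has 20,000 (sentence_0, sentence_1) pairs.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "What is the purpose of dabigatran?",
        "What are the side effects of melatonin agonists?",
    ],
    "sentence_1": [
        "Dabigatran is an anticoagulant prescribed for stroke prevention in atrial fibrillation.",
        "Melatonin agonists may cause nausea, headache, and elevated liver enzyme levels.",
    ],
})

# In-batch negatives: within each batch of 4 pairs, the other answers act as
# negatives for a given question. scale=20.0 and cos_sim match the loss config.
loss = MultipleNegativesRankingLoss(model, scale=20.0)

args = SentenceTransformerTrainingArguments(
    output_dir="pharma-gemma-300m-emb",  # hypothetical output path
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```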
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>
### Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.1 | 500 | 0.0134 |
| 0.2 | 1000 | 0.009 |
| 0.3 | 1500 | 0.0138 |
| 0.4 | 2000 | 0.0052 |
| 0.5 | 2500 | 0.0154 |
| 0.6 | 3000 | 0.0076 |
| 0.7 | 3500 | 0.0062 |
| 0.8 | 4000 | 0.0021 |
| 0.9 | 4500 | 0.0028 |
| 1.0 | 5000 | 0.0015 |
### Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.1.0
- Transformers: 4.56.1
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```