SentenceTransformer based on intfloat/e5-base-v2

This is a sentence-transformers model finetuned from intfloat/e5-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/e5-base-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
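
The Pooling and Normalize modules above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the helper names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Toy input: three 4-dimensional token vectors; the last one is padding.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
```

Because the output is unit length, the cosine similarity between two such embeddings reduces to their dot product.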

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("shrijayan/medical-e5-base-v2-v0.1")
# Run inference
sentences = [
    'Does amyloid peptide regulate calcium homoeostasis and arrhythmogenesis in pulmonary vein cardiomyocytes?',
    'Aβ 25 35 has direct electrophysiological effects on PV cardiomyocytes.',
    'Beta carotene has become popular in part because it s an antioxidant a substance that may protect cells from damage. A number of studies show that people who eat lots of fruits and vegetables that are rich in beta carotene and other vitamins and minerals have a lower risk of some cancers and heart disease. However, so far studies have not found that beta carotene supplements have the same health benefits as foods.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
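
For semantic search, the similarity scores are typically used to rank candidate passages against a query. The following is a toy sketch of that ranking step in plain Python; the 3-dimensional vectors are made up for illustration and stand in for real 768-dimensional embeddings, and `rank_by_similarity` is a hypothetical helper, not a library function.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, passage_vecs):
    """Return (score, passage_index) pairs, most similar first."""
    scores = [(cosine(query_vec, p), idx) for idx, p in enumerate(passage_vecs)]
    return sorted(scores, reverse=True)

# Made-up vectors standing in for model.encode(...) output.
query = [0.9, 0.1, 0.0]
passages = [[0.0, 0.2, 0.9], [0.8, 0.3, 0.1], [0.1, 0.9, 0.1]]

ranking = rank_by_similarity(query, passages)
best_score, best_index = ranking[0]
```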

Evaluation

Metrics

Triplet

Metric eval-dataset test-dataset
cosine_accuracy 0.9937 0.9964
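
As a rough guide to reading this metric, the sketch below assumes that cosine_accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The toy 2-dimensional vectors and the function name `triplet_cosine_accuracy` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)

toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),   # positive is closer: correct
    ([1.0, 1.0], [1.0, -1.0], [1.0, 0.9]),  # negative is closer: incorrect
    ([0.5, 0.5], [0.4, 0.6], [-1.0, 0.0]),  # positive is closer: correct
]
accuracy = triplet_cosine_accuracy(toy_triplets)
```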

Training Details

Training Dataset

Unnamed Dataset

  • Size: 378,558 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 6 tokens, mean 24.72 tokens, max 147 tokens
    • sentence2 (string): min 5 tokens, mean 88.11 tokens, max 512 tokens
    • label (float): min 1.0, mean 1.0, max 1.0
  • Samples:
    • sentence1: Does tolbutamide alter glucose transport and metabolism in the embryonic mouse heart?
      sentence2: Tolbutamide stimulates glucose uptake and metabolism in the embryonic heart, as occurs in adult extra pancreatic tissues. Glut 1 and HKI, but not GRP78, are likely involved in tolbutamide induced cardiac dysmorphogenesis.
      label: 1.0
    • sentence1: Do flk1 cells derived from mouse embryonic stem cells reconstitute hematopoiesis in vivo in SCID mice?
      sentence2: The Flk1 hematopoietic cells derived from ES cells reconstitute hematopoiesis in vivo and may become an alternative donor source for bone marrow transplantation.
      label: 1.0
    • sentence1: Does systematic aging of degradable nanosuspension ameliorate vibrating mesh nebulizer performance?
      sentence2: Nebulization of purified nanosuspensions resulted in droplet diameters of 7.0 µm. However, electrolyte supplementation and storage, which led to an increase in sample conductivity 10 20 µS cm , were capable of providing smaller droplet diameters during vibrating mesh nebulization 5.0 µm . No relevant change of NP properties i.e. size, morphology, remaining mass and molecular weight of the employed polymer was observed when incubated at 22 C for two weeks.
      label: 1.0
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
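
To make the two parameters concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective, assuming the usual formulation: each anchor is scored against every positive in the batch by cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the matching positive as the target. This is an illustration under that assumption, not the library's implementation, and `mnr_loss` is a hypothetical name.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch serves as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.1], [0.1, 1.0]])  # correct pairs line up
swapped = mnr_loss(anchors, [[0.1, 1.0], [1.0, 0.1]])  # pairs are mismatched
```

A larger `scale` sharpens the softmax, so well-separated pairs drive the loss toward zero while mismatched pairs are penalized more heavily.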
    

Evaluation Dataset

Unnamed Dataset

  • Size: 47,320 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 5 tokens, mean 24.45 tokens, max 253 tokens
    • sentence2 (string): min 7 tokens, mean 87.68 tokens, max 512 tokens
    • label (float): min 1.0, mean 1.0, max 1.0
  • Samples:
    • sentence1: Does thrombospondin 2 gene silencing in human aortic smooth muscle cells improve cell attachment?
      sentence2: siRNA mediated TSP 2 silencing of human aortic HAoSMCs improved cell attachment but had no effect on cell migration or proliferation. The effect on cell attachment was unrelated to changes in MMP activity.
      label: 1.0
    • sentence1: What can you do to manage polycythemia vera?
      sentence2: Most people with polycythemia vera take low dose aspirin. There are a lot of ways you can keep yourself comfortable and as healthy as possible Don t smoke or chew tobacco. Tobacco makes blood vessels narrow, which can make blood clots more likely. Get some light exercise, such as walking, to help your circulation and keep your heart healthy. Do leg and ankle exercises to stop clots from forming in the veins of your legs. Your doctor or a physical therapist can show you how. Bathe or shower in cool water if warm water makes you itch. Keep your skin moist with lotion, and try not to scratch.
      label: 1.0
    • sentence1: Is weekly nab paclitaxel safe and effective in 65 years old patients with metastatic breast cancer a post hoc analysis?
      sentence2: Weekly nab paclitaxel was safe and more efficacious compared with the q3w schedule and with solvent based taxanes in older patients with MBC.
      label: 1.0
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • do_predict: True
  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
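
The no_duplicates batch sampler matters because in-batch negatives assume that every other text in the batch is a true negative; a repeated text would act as a false negative. The greedy routine below is a hypothetical illustration of that idea only. It is not the library's sampler, and the pair data is invented.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily build batches in which no text appears more than once."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for anchor, positive in remaining:
            fits = len(batch) < batch_size
            if fits and anchor not in seen and positive not in seen:
                batch.append((anchor, positive))
                seen.update((anchor, positive))
            else:
                deferred.append((anchor, positive))  # retry in a later batch
        batches.append(batch)
        remaining = deferred
    return batches

# Invented question/answer pairs; "q1" and "a1" each occur twice.
pairs = [("q1", "a1"), ("q1", "a2"), ("q2", "a3"), ("q3", "a1")]
batches = no_duplicate_batches(pairs, batch_size=2)
```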

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: True
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
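
The learning-rate settings above (learning_rate 5e-05, lr_scheduler_type linear, warmup_ratio 0.1, warmup_steps 0) can be read as a linear warmup followed by a linear decay. The sketch below assumes that shape and takes the 5915 total steps from the final row of the training log; `linear_schedule_lr` is a hypothetical helper rather than a library call.

```python
import math

def linear_schedule_lr(step, total_steps=5915, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

warmup_steps = math.ceil(5915 * 0.1)      # number of warmup steps
peak_lr = linear_schedule_lr(warmup_steps)
final_lr = linear_schedule_lr(5915)
```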

Training Logs

Epoch Step Training Loss Validation Loss eval-dataset_cosine_accuracy test-dataset_cosine_accuracy
0 0 - - 0.9813 -
0.0085 50 1.8471 - - -
0.0169 100 0.5244 - - -
0.0254 150 0.2175 - - -
0.0338 200 0.1392 - - -
0.0423 250 0.1437 - - -
0.0507 300 0.142 - - -
0.0592 350 0.1295 - - -
0.0676 400 0.1238 - - -
0.0761 450 0.14 - - -
0.0845 500 0.1173 0.1006 0.9931 -
0.0930 550 0.1236 - - -
0.1014 600 0.1127 - - -
0.1099 650 0.1338 - - -
0.1183 700 0.1071 - - -
0.1268 750 0.1149 - - -
0.1352 800 0.1072 - - -
0.1437 850 0.1117 - - -
0.1522 900 0.1087 - - -
0.1606 950 0.1242 - - -
0.1691 1000 0.1039 0.091 0.9965 -
0.1775 1050 0.1043 - - -
0.1860 1100 0.1193 - - -
0.1944 1150 0.1028 - - -
0.2029 1200 0.1027 - - -
0.2113 1250 0.1075 - - -
0.2198 1300 0.1177 - - -
0.2282 1350 0.0937 - - -
0.2367 1400 0.1095 - - -
0.2451 1450 0.1054 - - -
0.2536 1500 0.1003 0.0798 0.9958 -
0.2620 1550 0.0952 - - -
0.2705 1600 0.1028 - - -
0.2790 1650 0.0988 - - -
0.2874 1700 0.0887 - - -
0.2959 1750 0.1027 - - -
0.3043 1800 0.0937 - - -
0.3128 1850 0.1031 - - -
0.3212 1900 0.0857 - - -
0.3297 1950 0.094 - - -
0.3381 2000 0.1044 0.0721 0.9954 -
0.3466 2050 0.0829 - - -
0.3550 2100 0.0934 - - -
0.3635 2150 0.0785 - - -
0.3719 2200 0.0938 - - -
0.3804 2250 0.0885 - - -
0.3888 2300 0.0907 - - -
0.3973 2350 0.0911 - - -
0.4057 2400 0.0891 - - -
0.4142 2450 0.0798 - - -
0.4227 2500 0.0856 0.0655 0.9935 -
0.4311 2550 0.0925 - - -
0.4396 2600 0.0778 - - -
0.4480 2650 0.0871 - - -
0.4565 2700 0.0769 - - -
0.4649 2750 0.0815 - - -
0.4734 2800 0.0697 - - -
0.4818 2850 0.0714 - - -
0.4903 2900 0.0788 - - -
0.4987 2950 0.0772 - - -
0.5072 3000 0.0825 0.0618 0.9917 -
0.5156 3050 0.0742 - - -
0.5241 3100 0.0784 - - -
0.5325 3150 0.0697 - - -
0.5410 3200 0.0791 - - -
0.5495 3250 0.0657 - - -
0.5579 3300 0.0779 - - -
0.5664 3350 0.0719 - - -
0.5748 3400 0.0656 - - -
0.5833 3450 0.0698 - - -
0.5917 3500 0.0678 0.0578 0.9903 -
0.6002 3550 0.0771 - - -
0.6086 3600 0.0645 - - -
0.6171 3650 0.078 - - -
0.6255 3700 0.064 - - -
0.6340 3750 0.0691 - - -
0.6424 3800 0.0634 - - -
0.6509 3850 0.0732 - - -
0.6593 3900 0.059 - - -
0.6678 3950 0.0671 - - -
0.6762 4000 0.0633 0.0552 0.9936 -
0.6847 4050 0.0732 - - -
0.6932 4100 0.0593 - - -
0.7016 4150 0.0639 - - -
0.7101 4200 0.0672 - - -
0.7185 4250 0.0604 - - -
0.7270 4300 0.0666 - - -
0.7354 4350 0.0594 - - -
0.7439 4400 0.0783 - - -
0.7523 4450 0.0654 - - -
0.7608 4500 0.0596 0.0520 0.9937 -
0.7692 4550 0.0654 - - -
0.7777 4600 0.0511 - - -
0.7861 4650 0.0641 - - -
0.7946 4700 0.0609 - - -
0.8030 4750 0.0591 - - -
0.8115 4800 0.0496 - - -
0.8199 4850 0.0624 - - -
0.8284 4900 0.0639 - - -
0.8369 4950 0.056 - - -
0.8453 5000 0.0641 0.0487 0.9947 -
0.8538 5050 0.0608 - - -
0.8622 5100 0.0725 - - -
0.8707 5150 0.055 - - -
0.8791 5200 0.0556 - - -
0.8876 5250 0.0489 - - -
0.8960 5300 0.0513 - - -
0.9045 5350 0.0493 - - -
0.9129 5400 0.0574 - - -
0.9214 5450 0.0665 - - -
0.9298 5500 0.0588 0.0475 0.9937 -
0.9383 5550 0.0557 - - -
0.9467 5600 0.0497 - - -
0.9552 5650 0.0592 - - -
0.9637 5700 0.0526 - - -
0.9721 5750 0.0683 - - -
0.9806 5800 0.0588 - - -
0.9890 5850 0.0541 - - -
0.9975 5900 0.0636 - - -
1.0 5915 - - - 0.9964
  • The bold row denotes the saved checkpoint.
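
The step counts in the log can be cross-checked against the dataset size with simple arithmetic. With 378,558 training samples and a per-device batch size of 32, one device would need about 11,830 steps per epoch, while the logged 5915 steps is consistent with an effective batch of 64. The card does not state the number of devices, so the two-device reading below is an inference, not a documented fact.

```python
import math

def steps_per_epoch(num_samples, per_device_batch_size, num_devices=1):
    """Optimizer steps needed to see every sample once (no gradient accumulation)."""
    return math.ceil(num_samples / (per_device_batch_size * num_devices))

train_samples = 378_558
single_device = steps_per_epoch(train_samples, 32)     # one device
two_devices = steps_per_epoch(train_samples, 32, 2)    # assumed: two devices

# Fraction of the epoch completed at step 50, rounded like the log's Epoch column.
epoch_at_step_50 = round(50 / two_devices, 4)
```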

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.3.0
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model shrijayan/medical-e5-base-v2-v0.1 is stored as Safetensors: 109M params, tensor type F32.