Fine-tuned with openFDA

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2 on openFDA data. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
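In other words, the model mean-pools the 384-dimensional BERT token embeddings over non-padding tokens and then L2-normalizes the result. The following is a minimal sketch of that pipeline using plain transformers, for illustration only; it loads the base checkpoint, and the fine-tuned model itself should be loaded through SentenceTransformer as shown under Usage below.

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
bert = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=256, return_tensors="pt")
with torch.no_grad():
    token_embeddings = bert(**batch).last_hidden_state   # (batch, seq_len, 384)

# (1) Pooling: mean over non-padding tokens
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

# (2) Normalize: unit-length vectors, so dot product equals cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)   # torch.Size([1, 384])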

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kivanc/test")
# Run inference
sentences = [
    'The pH range of TissueBlue 0.025% Solution is between 7.3 and 7.6.',
    'The pH range of the solution is 4.5 to 7.5.',
    'I.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
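Because the embeddings are unit-normalized, the same similarity call can also be used for simple semantic search, i.e. ranking a small corpus against a query. A short follow-up to the snippet above; the query and corpus strings here are made up for illustration.

# Continues the snippet above; query/corpus strings are illustrative only.
query = "What is the pH range of the solution?"
corpus = [
    "The pH range of the solution is 4.5 to 7.5.",
    "Gently massage into affected areas.",
]

query_embedding = model.encode([query])
corpus_embeddings = model.encode(corpus)

scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 2]
best = int(scores.argmax())
print(corpus[best], float(scores[0, best]))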

Evaluation

Metrics

Triplet (dataset: dev-eva)

Metric Value
cosine_accuracy 1.0
dot_accuracy 0.0
manhattan_accuracy 1.0
euclidean_accuracy 1.0
max_accuracy 1.0

Triplet (dataset: test-eva)

Metric Value
cosine_accuracy 1.0
dot_accuracy 0.0
manhattan_accuracy 1.0
euclidean_accuracy 1.0
max_accuracy 1.0
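These accuracies measure how often the anchor embedding is closer to the positive than to the negative under each distance function. A minimal sketch of how such an evaluator can be set up with TripletEvaluator, reusing the three Usage sentences as a stand-in triplet; the actual dev/test triplets are described under Training Details below.

from sentence_transformers.evaluation import TripletEvaluator

# Illustrative triplet only; real evaluation used the datasets described below.
evaluator = TripletEvaluator(
    anchors=["The pH range of TissueBlue 0.025% Solution is between 7.3 and 7.6."],
    positives=["The pH range of the solution is 4.5 to 7.5."],
    negatives=["I."],
    name="dev-eva",
)
metrics = evaluator(model)  # accuracy per similarity/distance function
print(metrics)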

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,344 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min: 3 tokens, mean: 41.73 tokens, max: 255 tokens
    • positive: string, min: 3 tokens, mean: 42.27 tokens, max: 256 tokens
    • negative: string, min: 3 tokens, mean: 24.11 tokens, max: 157 tokens
  • Samples:
    • anchor: Table 6– Physical Decay Chart for Technetium-99m, Half-Life 6.02 Hours Hours Fraction Remaining Hours Fraction Remaining 0* 1.000 7 0.447 1 0.891 8 0.398 2 0.794 9 0.355 3 0.708 10 0.316 4 0.631 11 0.282 5 0.562 12 0.251 6 0.501 *Calibration Time
      positive: Table 9 Physical Decay Chart of Technetium 99m Tc, Half Life: 6 Hours *Calibration Time Hours Fraction Remaining Hours Fraction Remaining 0 * 1.000 5 0.562 1 0.891 6 0.501 2 0.794 8 0.398 3 0.708 10 0.316 4 0.631 12 0.251
      negative: -Gently massage into affected areas.
    • anchor: The compound has the empirical formula C 43 H 68 ClNO 11 and the molecular weight of 810.47.
      positive: Its molecular formula is C 17 H 16 ClNO⋅C 4 H 4 O 4 and its molecular weight is 401.84 (free base: 285.8).
      negative: Intravesical instillation for the treatment of interstitial cystitis.
    • anchor: Adempas 1, 1.5, 2 and 2.5 mg tablets contain, in addition, ferric oxide yellow.
      positive: Adempas 2 and 2.5 mg tablets contain, in addition, ferric oxide red.
      negative: Higher temperatures lead to greater losses.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
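A minimal sketch of how a loss with these parameters is typically constructed; the base checkpoint name is taken from Model Details, and with (anchor, positive, negative) columns the explicit negative acts as a hard negative on top of the in-batch negatives.

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# scale=20.0 and cosine similarity match the parameters listed above.
loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)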
    

Evaluation Dataset

Unnamed Dataset

  • Size: 1,336 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min: 3 tokens, mean: 38.62 tokens, max: 256 tokens
    • positive: string, min: 3 tokens, mean: 38.35 tokens, max: 256 tokens
    • negative: string, min: 3 tokens, mean: 24.91 tokens, max: 189 tokens
  • Samples:
    • anchor: Sacituzumab govitecan-hziy contains on average 7 to 8 molecules of SN-38 per antibody molecule.
      positive: An average of 2.3 molecules of SG3249 are attached to each antibody molecule.
      negative: Over this time period, blood pressure returns gradually to pretreatment levels.
    • anchor: 11 DESCRIPTION INVOKANA ® (canagliflozin) contains canagliflozin, an inhibitor of SGLT2, the transporter responsible for reabsorbing the majority of glucose filtered by the kidney.
      positive: Canagliflozin, the active ingredient of INVOKANA, is chemically known as (1 S )-1,5-anhydro-1-[3-[[5-(4-fluorophenyl)-2-thienyl]methyl]-4-methylphenyl]-D-glucitol hemihydrate and its molecular formula and weight are C 24 H 25 FO 5 S∙1/2 H 2 O and 453.53, respectively.
      negative: 1 Evaluated Nuclear Structure Data File of the Oak Ridge Nuclear Data Project DOE (1985).
    • anchor: Its molecular formula is C 25 H 37 NO 4 .
      positive: Its molecular formula is C 27 H 41 NO 8 .
      negative: GOOD LENS CARE PRACTICES: ☞ Always wash and rinse your hands before you handle your lenses.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
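Taken together, these settings correspond to a fairly standard Sentence Transformers 3.x training run. Below is a hypothetical reconstruction using SentenceTransformerTrainer; the tiny train_dataset/eval_dataset are placeholders for the 5,344/1,336-triplet datasets described above.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

# Placeholder data; the real datasets hold the triplets described above.
train_dataset = Dataset.from_dict({
    "anchor":   ["anchor sentence"],
    "positive": ["a paraphrase of the anchor"],
    "negative": ["an unrelated sentence"],
})
eval_dataset = train_dataset

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    num_train_epochs=10,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()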

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch  Step  Training Loss  Validation Loss  dev-eva_max_accuracy  test-eva_max_accuracy
0 0 - - 1.0 -
0.2994 100 0.0288 0.0080 1.0 -
0.5988 200 0.0297 0.0089 1.0 -
0.8982 300 0.0283 0.0103 1.0 -
1.1976 400 0.0021 0.0111 1.0 -
1.4970 500 0.0008 0.0137 1.0 -
1.7964 600 0.0 0.0137 1.0 -
1.2754 700 0.0198 0.0109 1.0 -
1.5749 800 0.0239 0.0165 1.0 -
1.8743 900 0.0118 0.0133 1.0 -
2.1737 1000 0.0012 0.0117 1.0 -
2.4731 1100 0.0001 0.0116 1.0 -
2.7725 1200 0.0 0.0116 1.0 -
2.2515 1300 0.0041 0.0120 1.0 -
2.5509 1400 0.0063 0.0102 1.0 -
2.8503 1500 0.0039 0.0154 1.0 -
3.1497 1600 0.0008 0.0113 1.0 -
3.4491 1700 0.0 0.0110 1.0 -
3.7485 1800 0.0 0.0110 1.0 -
3.2275 1900 0.0017 0.0122 1.0 -
3.5269 2000 0.0023 0.0119 1.0 -
3.8263 2100 0.0019 0.0123 1.0 -
4.1257 2200 0.0006 0.0125 1.0 -
4.4251 2300 0.0 0.0120 1.0 -
4.7246 2400 0.0 0.0120 1.0 -
4.2036 2500 0.0009 0.0125 1.0 -
4.5030 2600 0.0012 0.0115 1.0 -
4.8024 2700 0.0013 0.0125 1.0 -
5.1018 2800 0.0004 0.0120 1.0 -
5.4012 2900 0.0 0.0118 1.0 -
5.7006 3000 0.0 0.0118 1.0 -
5.1796 3100 0.0006 0.0120 1.0 -
5.4790 3200 0.001 0.0118 1.0 -
5.7784 3300 0.001 0.0118 1.0 -
5.8982 3340 - - - 1.0

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}