base_model: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:44072
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Men
    sentences:
      - Casual
      - Spring
      - Navy Blue
      - Carlton London Men Navy Blue Shoes
      - Footwear
      - Casual Shoes
      - Shoes
  - source_sentence: Men
    sentences:
      - Winter
      - Black
      - Casual
      - Accessories
      - United Colors of Benetton Men Black Sunglasses
      - Eyewear
      - Sunglasses
  - source_sentence: Women
    sentences:
      - Casual Shoes
      - Purple
      - Casual
      - Footwear
      - Summer
      - ADIDAS Neo Women Renewal Purple Shoes
      - Shoes
  - source_sentence: Men
    sentences:
      - Wallets
      - Summer
      - Accessories
      - Brown
      - Formal
      - Peter England Men Statements Brown Wallet
      - Wallets
  - source_sentence: Men
    sentences:
      - Yellow
      - Apparel
      - Topwear
      - Peter England Men Stripes Yellow Polo T-Shirt
      - Tshirts
      - Fall
      - Casual
license: mit
datasets:
  - MohamedAshraf701/Products-Details
language:
  - en
new_version: MohamedAshraf701/multi-qa-MiniLM-L6-cos-v1-products

SentenceTransformer based on sentence-transformers/multi-qa-MiniLM-L6-cos-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/multi-qa-MiniLM-L6-cos-v1. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
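
The Pooling and Normalize stages listed above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the helper names and the 2-dimensional toy vectors are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is skipped."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    return [value / max(count, 1) for value in summed]

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)

# Toy 2-dimensional token vectors; the real model produces 384 dimensions.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the last token is padding and is ignored

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the final vector has unit length, the dot product of two such embeddings equals their cosine similarity.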

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("MohamedAshraf701/multi-qa-MiniLM-L6-cos-v1-products")
# Run inference
sentences = [
    'Men',
    'Apparel',
    'Topwear',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
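
For semantic search, candidate texts are typically ranked by their similarity to a query. The sketch below shows that ranking step on hand-made unit-length vectors, which stand in for the output of model.encode; the vectors, their pairing with the labels, and the helper names are illustrative assumptions and say nothing about what the model actually returns.

```python
def dot(a, b):
    """Dot product; for unit-length vectors this equals cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query_vec, candidate_vecs, labels):
    """Return (score, label) pairs sorted from most to least similar."""
    scored = [(dot(query_vec, vec), label)
              for vec, label in zip(candidate_vecs, labels)]
    return sorted(scored, reverse=True)

# Hypothetical unit-length vectors, not real model output.
query = [1.0, 0.0]
candidates = [[0.6, 0.8], [0.0, 1.0], [0.8, 0.6]]
labels = ["Casual Shoes", "Sunglasses", "Sports Shoes"]

ranking = rank_by_similarity(query, candidates, labels)
for score, label in ranking:
    print(f"{score:.2f}  {label}")
```

With real embeddings, the same ordering can be read off one row of the matrix returned by model.similarity.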

Training Details

Training Dataset

Unnamed Dataset

  • Size: 44,072 training samples
  • Columns: gender, masterCategory, subCategory, articleType, baseColour, season, usage, and productDisplayName
  • Approximate statistics based on the first 1000 samples:
    • gender (string): min 3 tokens, mean 3.1 tokens, max 5 tokens
    • masterCategory (string): min 3 tokens, mean 3.26 tokens, max 4 tokens
    • subCategory (string): min 3 tokens, mean 3.62 tokens, max 7 tokens
    • articleType (string): min 3 tokens, mean 3.9 tokens, max 7 tokens
    • baseColour (string): min 3 tokens, mean 3.08 tokens, max 5 tokens
    • season (string): min 3 tokens, mean 3.0 tokens, max 3 tokens
    • usage (string): min 3 tokens, mean 3.0 tokens, max 3 tokens
    • productDisplayName (string): min 6 tokens, mean 10.13 tokens, max 28 tokens
  • Samples:
    gender | masterCategory | subCategory | articleType | baseColour | season | usage | productDisplayName
    Women | Footwear | Shoes | Heels | Gold | Summer | Casual | Enroute Women Gold Flats
    Men | Accessories | Belts | Belts | Black | Fall | Casual | Wrangler Textured Men Black Belts
    Men | Footwear | Shoes | Sports Shoes | Grey | Fall | Sports | Nike Men Air Max+ 2011 Grey Sports Shoes
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
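
To make the two loss parameters concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by the scale of 20.0, and cross-entropy is taken with the matching positive as the target. It covers only the anchor/positive case and is not the library's implementation; the function names and toy vectors are assumptions made for illustration.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Well-separated pairs: each anchor matches only its own positive.
separated = mnr_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Collapsed batch: every candidate looks identical to every anchor.
collapsed = mnr_loss([[1.0, 0.0]] * 4, [[1.0, 0.0]] * 4)
print(separated, collapsed)
```

When every in-batch candidate scores the same, this objective equals the natural log of the batch size (ln 4 in the toy case). For a batch of 128, that reference value is ln(128) ≈ 4.85.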
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • num_train_epochs: 20
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 20
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
1.4493 500 5.1318
2.8986 1000 4.7978
4.3478 1500 4.7906
5.7971 2000 4.7948
7.2464 2500 4.7897
8.6957 3000 4.7936
10.1449 3500 4.789
11.5942 4000 4.7916
13.0435 4500 4.7887
14.4928 5000 4.7903
15.9420 5500 4.791
17.3913 6000 4.788
18.8406 6500 4.7909
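
The fractional epoch values in the log follow from the dataset size and batch size. The short calculation below reproduces them, assuming a single device, gradient_accumulation_steps of 1, and dataloader_drop_last set to False, as listed in the hyperparameters; the variable and function names are illustrative.

```python
import math

train_samples = 44_072
batch_size = 128
num_epochs = 20

# The last, partially filled batch still counts as a step (drop_last is False).
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps)    # 345 6900
print(epoch_at(500), epoch_at(6500))   # 1.4493 18.8406
```

The values for steps 500 and 6500 match the first and last rows of the log above.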

Framework Versions

  • Python: 3.12.6
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.1
  • PyTorch: 2.4.1
  • Accelerate: 0.34.2
  • Datasets: 3.0.1
  • Tokenizers: 0.20.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}