intfloat-fine-tuned

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-large-instruct on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/multilingual-e5-large-instruct
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: tr
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
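
The Pooling and Normalize modules listed above can be illustrated with a minimal NumPy sketch. It assumes the usual meaning of `pooling_mode_mean_tokens` (averaging token vectors while ignoring padding) followed by L2 normalization. The function name and the toy inputs are hypothetical; this is not the library's own implementation.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average non-padding token vectors, then scale each result to unit length.

    token_embeddings: float array of shape (batch, seq_len, hidden)
    attention_mask:   0/1 array of shape (batch, seq_len); 0 marks padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # avoid division by zero
    pooled = summed / counts                            # mean pooling
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)         # L2 normalization

# Toy input: 2 sequences, 3 token positions, hidden size 4 (the real model uses 1024).
tokens = np.array([
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],  # last token is padding
    [[0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 2.0, 0.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)
```

Because every output vector has unit length, the cosine similarity between two embeddings is simply their dot product.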

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omerhan/checkpoint-4686-v7")
# Run inference
sentences = [
    'ret kuyruğu nedir',
    "Bir kuyruktan gelen mesajlar 'ölü harfli' olabilir; yani, aşağıdaki olaylardan herhangi biri meydana geldiğinde başka bir değiş tokuşa yeniden yayınlanabilir: 1 İleti, requeue=false ile (basic.reject veya basic.nack) reddedilir, 2 İletinin TTL'si sona erer; veya. 3 Kuyruk uzunluğu sınırı aşılır.",
    "2.'reddetmek'. Bir fikir veya inançla aynı fikirde değilseniz,'reddetmek' demiyorsunuz. Bunu reddettiğinizi söylüyorsunuz. Bazı insanlar karma ekonomi fikrini reddediyor. Ailemin dini inançlarını reddetmek benim için zordu. 3. İsim olarak kullanılır. Reddetmek, attığınız şeylere atıfta bulunmak için kullanılan bir isimdir.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
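
For semantic search, the similarity scores are typically used to rank passages against a query. The following is a small sketch of that ranking step in plain NumPy. The helper `rank_documents` and the 3-dimensional toy vectors are hypothetical stand-ins, used so the example runs without downloading the model.

```python
import numpy as np

def rank_documents(query_embedding, document_embeddings):
    """Return document indices sorted from most to least similar, plus the scores."""
    q = query_embedding / np.linalg.norm(query_embedding)
    d = document_embeddings / np.linalg.norm(document_embeddings, axis=1, keepdims=True)
    scores = d @ q                    # cosine similarity of each document with the query
    order = np.argsort(-scores)       # highest score first
    return order, scores

# Hypothetical toy vectors standing in for real 1024-dimensional embeddings.
toy_embeddings = np.array([
    [0.9, 0.1, 0.0],   # query
    [0.8, 0.2, 0.1],   # passage pointing in nearly the same direction
    [0.0, 0.3, 0.9],   # passage pointing elsewhere
])
order, scores = rank_documents(toy_embeddings[0], toy_embeddings[1:])
print(order, scores)
```

With the real model, and assuming `embeddings` is the NumPy array returned by `model.encode` above, the call would be `rank_documents(embeddings[0], embeddings[1:])`, treating the first sentence as the query.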

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 920,106 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean          max
    anchor    string  4 tokens   10.38 tokens  39 tokens
    positive  string  26 tokens  81.21 tokens  149 tokens
    negative  string  4 tokens   78.05 tokens  133 tokens
  • Samples:
    anchor positive negative
    Avustralya'ya özgü hangi meyve Passiflora herbertiana. Avustralya'ya özgü nadir bir tutku meyvesi. Meyveler yeşil tenli, beyaz etli, bilinmeyen bir yenilebilir derecelendirmeye sahiptir. Bazı kaynaklar meyveyi yenilebilir, tatlı ve lezzetli olarak listelerken, diğerleri meyveleri acı ve yenemez olarak listeler. Avustralya'ya özgü nadir bir tutku meyvesi. Meyveler yeşil tenli, beyaz etli, bilinmeyen yenilebilir bir derecelendirmeye sahip. Bazı kaynaklar meyveyi tatlı olarak listeler. Kola cevizi, Afrika'nın tropikal yağmur ormanlarına özgü bir ağaç cinsidir (Cola).
    meyve ağaçları türleri Kiraz. Kiraz ağaçları dünya çapında bulunur. Kirazdan siyah kiraza kadar değişen 40 veya daha fazla çeşit vardır. Meyve ile birlikte, kiraz ağaçları, son derece hoş kokulu hafif ve narin pembemsi-beyaz çiçekler üretir.Omments. Submit. Mülkünüze meyve ağaçları dikmek sadece size istikrarlı bir organik meyve kaynağı sağlamakla kalmaz, aynı zamanda bahçenizi güzelleştirmenizi ve oksijeni çevreye geri vermenizi sağlar. Kola cevizi, Afrika'nın tropikal yağmur ormanlarına özgü bir ağaç cinsidir (Cola).
    Harrison City Pa nerede yaşıyor? Harrison City, Amerika Birleşik Devletleri'nin Pensilvanya eyaletinde yer alan Westmoreland County'de nüfus sayımına göre belirlenmiş bir yerdir. 2000 nüfus sayımında nüfus 155'tir. En yakın şehirler: Vandling borough, PA (1.1 mil ), Simpson, PA (2.0 mil ), Union Dale borough, PA (2,1 mil ), Carbondale, PA (2,4 mil ), Waymart borough, PA (2,4 mil ), Mayfield borough, PA (2.9 mil ), Prompion borough, PA (2.9 mil ), Jermyn borough, PA (3.1 mil ).
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            1024
        ],
        "matryoshka_weights": [
            1
        ],
        "n_dims_per_step": -1
    }
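
To make the loss configuration more concrete, here is a rough NumPy sketch of the idea behind MultipleNegativesRankingLoss with (anchor, positive, negative) triplets: for each anchor, its own positive should score higher than every other positive and every negative in the batch. The scale of 20 and the use of cosine similarity are assumptions about the library defaults, and the function is an illustration rather than the library's code. Since `matryoshka_dims` contains only the full size 1024 with weight 1, the Matryoshka wrapper should add no truncated terms here, so the sketch omits it.

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, negatives, scale=20.0):
    """Cross-entropy over scaled cosine similarities with in-batch negatives.

    For anchor i the correct candidate is positives[i]; every other positive and
    every row of `negatives` acts as a negative. All inputs have shape (batch, dim).
    """
    def unit(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    candidates = np.concatenate([positives, negatives], axis=0)   # (2 * batch, dim)
    logits = scale * unit(anchors) @ unit(candidates).T           # (batch, 2 * batch)
    logits = logits - logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = np.arange(len(anchors))                             # positive i belongs to anchor i
    return -log_probs[targets, targets].mean()

# Toy batch of 4 with orthogonal directions (hypothetical data).
anchors = np.eye(4)
aligned_loss = multiple_negatives_ranking_loss(anchors, np.eye(4), -np.eye(4))
shuffled_loss = multiple_negatives_ranking_loss(anchors, np.roll(np.eye(4), 1, axis=0), -np.eye(4))
print(aligned_loss, shuffled_loss)
```

When each anchor matches its own positive the loss is close to zero; when the positives are shifted by one row, each anchor's best match is a wrong candidate and the loss is large.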
    

Training Hyperparameters

Non-Default Hyperparameters

  • gradient_accumulation_steps: 8
  • learning_rate: 5e-06
  • num_train_epochs: 1
  • lr_scheduler_type: cosine
  • tf32: True
  • optim: adamw_torch_fused
  • batch_sampler: no_duplicates
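
The `no_duplicates` batch sampler matters for losses that use in-batch negatives: if the same text appeared twice in a batch, a correct match could be counted as a negative. The following is a simplified, greedy sketch of that idea. The function name and the toy rows are hypothetical, and the library's sampler may differ in details such as shuffling and how it handles incomplete batches.

```python
def no_duplicate_batches(rows, batch_size):
    """Greedily group rows into batches in which no text appears twice.

    rows: list of tuples of strings, e.g. (anchor, positive, negative).
    Rows that clash with the current batch are deferred to a later batch.
    """
    remaining = list(range(len(rows)))
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for idx in remaining:
            texts = set(rows[idx])
            if len(batch) < batch_size and not (texts & seen):
                batch.append(idx)       # no overlap with texts already in this batch
                seen |= texts
            else:
                deferred.append(idx)    # try again in a later batch
        batches.append(batch)
        remaining = deferred
    return batches

# Hypothetical rows: row 1 shares a negative with row 0, row 3 shares an anchor with row 0.
rows = [
    ("q1", "p1", "n1"),
    ("q2", "p2", "n1"),
    ("q3", "p3", "n3"),
    ("q1", "p4", "n4"),
]
batches = no_duplicate_batches(rows, batch_size=2)
print(batches)
```

In this toy run, rows 1 and 3 are kept out of the first batch because each shares a text with row 0, and they end up together in the second batch.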

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
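
The combination `lr_scheduler_type: cosine`, `learning_rate: 5e-06` and `warmup_steps: 0` implies a learning rate that starts at its peak and decays along a half cosine. A minimal sketch of that shape, assuming the common half-cycle cosine formula, is shown below. The total step count is a hypothetical value derived from 920,106 samples, a batch size of 8 and 8 accumulation steps on a single device; the number of devices is not stated in this card.

```python
import math

def cosine_lr(step, total_steps, base_lr=5e-6, warmup_steps=0):
    """Learning rate after `step` optimizer updates: linear warmup, then half-cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Hypothetical total: roughly 920,106 / (8 * 8) optimizer steps for one epoch.
TOTAL_STEPS = 14_377
for step in (0, 4_500, TOTAL_STEPS):
    print(step, cosine_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate is still above three quarters of its peak at step 4500, the last step shown in the training logs.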

Training Logs

Epoch Step Training Loss
0.0348 500 0.1492
0.0696 1000 0.1114
0.1043 1500 0.1013
0.1391 2000 0.0988
0.1739 2500 0.0973
0.2087 3000 0.0909
0.2434 3500 0.0858
0.2782 4000 0.0899
0.3130 4500 0.0861
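
The Epoch column can be related to the Step column with a short calculation. The sketch below assumes a single training device (not stated in this card) and ignores any rounding the trainer applies when counting steps per epoch.

```python
TRAIN_SAMPLES = 920_106      # from the training dataset section
PER_DEVICE_BATCH = 8         # per_device_train_batch_size
GRAD_ACCUM = 8               # gradient_accumulation_steps
NUM_DEVICES = 1              # assumption: device count is not given in the card

samples_per_step = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_DEVICES
steps_per_epoch = TRAIN_SAMPLES / samples_per_step   # about 14,377 optimizer steps

def epoch_fraction(step):
    """Fraction of one epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)

for step in (500, 1000, 4500):
    print(step, epoch_fraction(step))
```

Under the single-device assumption each optimizer step covers 64 samples, and the computed fractions agree with the logged values 0.0348, 0.0696 and 0.3130.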

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 560M params (Safetensors, tensor type F32)