SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
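
These properties can be read directly off the loaded model. A minimal check, assuming the model id from the Usage section below:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("seongil-dn/bge-m3-kor-retrieval-451949-bs64-admin")
print(model.get_max_seq_length())                # 1024
print(model.get_sentence_embedding_dimension())  # 1024
print(model.similarity_fn_name)                  # cosine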

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
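
The stack above is CLS pooling over an XLM-RoBERTa encoder followed by L2 normalization. A rough sketch of the equivalent computation with plain 🤗 Transformers is shown below; the SentenceTransformer API in the Usage section remains the supported path, and loading this repository directly with AutoModel is an assumption:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "seongil-dn/bge-m3-kor-retrieval-451949-bs64-admin"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

batch = tokenizer(["예시 문장입니다."], padding=True, truncation=True, max_length=1024, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 1024)
cls = hidden[:, 0]                               # CLS pooling (pooling_mode_cls_token=True)
embedding = F.normalize(cls, p=2, dim=1)         # the Normalize() module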

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("seongil-dn/bge-m3-kor-retrieval-451949-bs64-admin")
# Run inference
sentences = [
    '어떤 안건으로 제2차 그린철강위원회 관련 자동차협의회가 개최되었을까?',
    "제2차 그린철강위원회(6.18) 개최<br> 탄소중립 협의회 2차 회의 개최 경과 및 향후 일정 <br> <table><tbody><tr><td>시리즈</td><td>일정</td><td>협의회</td><td>주요 내용</td><td>구분</td></tr><tr><td> </td><td>6.2</td><td>정유</td><td>정유업계 탄소중립 기술개발 로드맵 추진방향 모색</td><td rowspan='2'>旣개최 </td></tr><tr><td> </td><td>6.15</td><td>석유화학</td><td> 석유화학 분야 2050 탄소중립을 위한 예타 R&D 기획·추진 현황</td></tr><tr><td> </td><td>6.18</td><td>철강</td><td>철강 분야 2050 감축시나리오 수립 동향, 탄소중립 R&D 로드맵 등</td><td>개최</td></tr><tr><td> </td><td>6.23</td><td>표준화</td><td>탄소중립 표준화 전략 추진현황 점검</td><td rowspan='11'>개최 예정 </td></tr><tr><td> </td><td>6월말</td><td>반도체 디스플레이 </td><td>반도체·디스플레이 탄소중립 R&D 로드맵 동향 및 탄소중립 방향성 논의</td></tr><tr><td> </td><td>6월말</td><td>섬유‧제지</td><td>섬유ㆍ제지산업 탄소중립 R&D전략 논의</td></tr><tr><td> </td><td>6월말</td><td>기계</td><td>기계산업 탄소중립 추진전략 논의(잠정)</td></tr><tr><td> </td><td>7월초</td><td>기술혁신</td><td>‘2050 탄소중립 R&D 전략’ 추진현황 논의</td></tr><tr><td> </td><td>7월초</td><td>자동차</td><td>자동차 2050 감축시나리오 수립 동향 및 탄소중립 로드맵 추진 현황</td></tr><tr><td> </td><td>7.1</td><td>조선</td><td>조선업 탄소중립 실현방안(잠정)</td></tr><tr><td> </td><td>7.2</td><td>바이오</td><td>협의체 운영방안 관련 주요 업계 간담회</td></tr><tr><td> </td><td>7.2</td><td>전기전자</td><td>전기전자 탄소중립 R&D 전략 논의 등 </td></tr><tr><td> </td><td>7.9</td><td>비철금속</td><td>비철금속업계 단기 온실가스 감축방안 논의 및 혁신사례 공유</td></tr><tr><td> </td><td>7.15</td><td>시멘트</td><td>시멘트산업 탄소중립 R&D 로드맵 및 탄소중립을 위한 제도개선 과제 마련</td></tr></tbody></table>",
    '제1차 녹색성장 이행점검회의 개최\n□ 김황식 국무총리는 9.7(수) 15:00 정부중앙청사에서 \uf000제1차 녹색성장 이행점검회의\uf000를 개최하여,\nㅇ ‘공공건축 에너지효율 향상’과 ‘그린카 산업발전 전략’ 등 두 건에 대한 이행점검결과를 보고받고 보완대책을 논의하였음 * 그린카는 에너지 소비 효율이 우수하고 무공해․저공해 자동차로서 ① 전력을 기반으로 하는 전기차, 연료전지차, ② 엔진을 기반으로 하는 하이브리드차, 클린디젤차 등을 의미\n□ 김총리는 이 자리에서 앞으로 매달 총리가 주재하는 관계장관회의를 통해 녹색성장 정책에 대한 이행실적을 점검하고,\nㅇ 그간 각 부처가 발표했던 주요 녹색성장정책들이 제대로 추진되고 있는지, 문제점이 있다면 그 이유가 무엇이며, 어떻게 해결해야 하는지 현실성 있는 해결방안을 마련해 나갈 계획임을 밝혔음 □ 김총리는 그동안 녹색성장 정책이 주로 계획수립 및 제도개선 과제에 집중하다보니, 상대적으로 집행단계에서 다소 미흡한 점이 있었다고 평가하며\nㅇ 녹색성장이 올바로 뿌리내릴 수 있도록 향후 중점 추진해야 할 핵심과제를 발굴하여 정책역량을 집중하고,\nㅇ 이를 통해 “국민에게 보고 드린 정책들은 반드시 제대로 추진해서 신뢰받는 정부가 되도록 해 달라”고 당부하였음 □ 녹색성장 정책 이행점검을 위해 녹색위는 그동안 관계부처, 민간전문가, 업계와 공동으로 점검을 실시하였으며, 점검결과는 다음과 같음 < ‘공공건축 에너지 효율’ 이행상황 점검 결과 >\n□ ‘08.8월 이후 국토부, 지경부, 환경부 등 7개 부처가 추진중인 3개분야 11개 정책을 점검한 결과\nㅇ 점검결과 신축청사의 에너지효율 기준 강화 등 제도개선과제는 정상추진\nㅇ 그린스쿨, 저탄소 녹색마을 등 실제 집행단계에 있는 과제는 보완이 필요',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
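
For the semantic-search use case mentioned at the top of the card, query embeddings can be matched against a corpus with the library's built-in utility. A small sketch with hypothetical Korean strings:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("seongil-dn/bge-m3-kor-retrieval-451949-bs64-admin")

query = "그린철강위원회 개최 안건"
corpus = [
    "제2차 그린철강위원회 개최 및 탄소중립 협의회 일정",
    "제1차 녹색성장 이행점검회의 개최 결과",
]
query_emb = model.encode(query, convert_to_tensor=True)
corpus_emb = model.encode(corpus, convert_to_tensor=True)

# Rank corpus entries by cosine similarity to the query
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))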

Training Details

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • learning_rate: 3e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.05
  • fp16: True
  • batch_sampler: no_duplicates
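
The training script itself is not part of this card. The sketch below shows how these non-default values would map onto SentenceTransformerTrainingArguments, paired with the CachedMultipleNegativesRankingLoss cited at the bottom; the dataset file and column layout are assumptions:

from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BAAI/bge-m3")
loss = CachedMultipleNegativesRankingLoss(model)

# Hypothetical (anchor, positive) pair file; the actual training data is not documented here.
train_dataset = load_dataset("json", data_files="pairs.jsonl", split="train")

args = SentenceTransformerTrainingArguments(
    output_dir="bge-m3-kor-retrieval",
    per_device_train_batch_size=64,
    learning_rate=3e-5,
    num_train_epochs=1,
    warmup_ratio=0.05,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # keep duplicate texts out of a batch of in-batch negatives
)

trainer = SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()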

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0057 1 0.8443
0.0114 2 0.9439
0.0171 3 0.8336
0.0229 4 0.7631
0.0286 5 0.7086
0.0343 6 0.6314
0.04 7 0.6318
0.0457 8 0.5864
0.0514 9 0.5219
0.0571 10 0.4932
0.0629 11 0.4067
0.0686 12 0.4542
0.0743 13 0.4086
0.08 14 0.4072
0.0857 15 0.3854
0.0914 16 0.3505
0.0971 17 0.3431
0.1029 18 0.3668
0.1086 19 0.3393
0.1143 20 0.31
0.12 21 0.3047
0.1257 22 0.3301
0.1314 23 0.2967
0.1371 24 0.3098
0.1429 25 0.2867
0.1486 26 0.2518
0.1543 27 0.2606
0.16 28 0.257
0.1657 29 0.2326
0.1714 30 0.2829
0.1771 31 0.3034
0.1829 32 0.2568
0.1886 33 0.2776
0.1943 34 0.298
0.2 35 0.2521
0.2057 36 0.2924
0.2114 37 0.2755
0.2171 38 0.2314
0.2229 39 0.2736
0.2286 40 0.2297
0.2343 41 0.2403
0.24 42 0.2805
0.2457 43 0.2348
0.2514 44 0.2064
0.2571 45 0.2227
0.2629 46 0.2062
0.2686 47 0.2666
0.2743 48 0.2183
0.28 49 0.2266
0.2857 50 0.2131
0.2914 51 0.2483
0.2971 52 0.2475
0.3029 53 0.2533
0.3086 54 0.2199
0.3143 55 0.2045
0.32 56 0.1937
0.3257 57 0.2144
0.3314 58 0.1842
0.3371 59 0.2374
0.3429 60 0.233
0.3486 61 0.2002
0.3543 62 0.1788
0.36 63 0.2128
0.3657 64 0.1996
0.3714 65 0.2241
0.3771 66 0.228
0.3829 67 0.2568
0.3886 68 0.2063
0.3943 69 0.1848
0.4 70 0.1842
0.4057 71 0.2318
0.4114 72 0.1968
0.4171 73 0.2032
0.4229 74 0.1883
0.4286 75 0.2148
0.4343 76 0.2275
0.44 77 0.2058
0.4457 78 0.2104
0.4514 79 0.2039
0.4571 80 0.1903
0.4629 81 0.1957
0.4686 82 0.2121
0.4743 83 0.1729
0.48 84 0.2159
0.4857 85 0.2048
0.4914 86 0.1755
0.4971 87 0.2023
0.5029 88 0.1851
0.5086 89 0.2018
0.5143 90 0.2199
0.52 91 0.2263
0.5257 92 0.1967
0.5314 93 0.2174
0.5371 94 0.2075
0.5429 95 0.1963
0.5486 96 0.1926
0.5543 97 0.185
0.56 98 0.2089
0.5657 99 0.1786
0.5714 100 0.2075
0.5771 101 0.205
0.5829 102 0.1526
0.5886 103 0.1909
0.5943 104 0.2004
0.6 105 0.1909
0.6057 106 0.2113
0.6114 107 0.2221
0.6171 108 0.2
0.6229 109 0.2164
0.6286 110 0.1656
0.6343 111 0.2221
0.64 112 0.2046
0.6457 113 0.1626
0.6514 114 0.1851
0.6571 115 0.1822
0.6629 116 0.1781
0.6686 117 0.1875
0.6743 118 0.1967
0.68 119 0.2009
0.6857 120 0.2092
0.6914 121 0.1781
0.6971 122 0.2149
0.7029 123 0.2409
0.7086 124 0.2073
0.7143 125 0.1851
0.72 126 0.1824
0.7257 127 0.1767
0.7314 128 0.2187
0.7371 129 0.2224
0.7429 130 0.195
0.7486 131 0.1558
0.7543 132 0.1979
0.76 133 0.1692
0.7657 134 0.1811
0.7714 135 0.199
0.7771 136 0.2137
0.7829 137 0.1704
0.7886 138 0.1829
0.7943 139 0.2346
0.8 140 0.1784
0.8057 141 0.1899
0.8114 142 0.1517
0.8171 143 0.168
0.8229 144 0.2025
0.8286 145 0.1685
0.8343 146 0.1825
0.84 147 0.2095
0.8457 148 0.2027
0.8514 149 0.1973
0.8571 150 0.1875
0.8629 151 0.2079
0.8686 152 0.1789
0.8743 153 0.1714
0.88 154 0.183
0.8857 155 0.1718
0.8914 156 0.1899
0.8971 157 0.1916
0.9029 158 0.1941
0.9086 159 0.1987
0.9143 160 0.1421
0.92 161 0.1598
0.9257 162 0.1596
0.9314 163 0.1801
0.9371 164 0.1595
0.9429 165 0.1983
0.9486 166 0.2002
0.9543 167 0.2045
0.96 168 0.167
0.9657 169 0.2106
0.9714 170 0.19
0.9771 171 0.1717
0.9829 172 0.1899
0.9886 173 0.1596
0.9943 174 0.1863
1.0 175 0.1969

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.3.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1
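
To recreate a comparable environment, the listed versions can be pinned at install time (a hedged example; the CUDA 12.1 build of PyTorch may require the matching extra index URL):

pip install sentence-transformers==3.2.1 transformers==4.44.2 torch==2.3.1 accelerate==1.1.1 datasets==2.21.0 tokenizers==0.19.1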

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}