SentenceTransformer based on neuralmind/bert-large-portuguese-cased

This is a sentence-transformers model finetuned from neuralmind/bert-large-portuguese-cased. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: neuralmind/bert-large-portuguese-cased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
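
The Pooling module above averages BERT's token embeddings (ignoring padding) into a single 1024-dimensional sentence vector. Below is a minimal sketch of that mean-pooling step, for illustration only; the SentenceTransformer pipeline applies it internally, and the function name here is ours:

import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 1024); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    summed = (token_embeddings * mask).sum(dim=1)  # sum of non-padding token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of non-padding tokens, broadcast over the 1024 dims
    return summed / counts                         # (batch, 1024) sentence embeddings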

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("SenhorDasMoscas/bert-ptbr-e3-lr0.0001-04-01-2025")
# Run inference
sentences = [
    'cobertor pelucia',
    'moda acessorio',
    'servico reparo eletronico',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
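
Continuing from the snippet above, the embeddings can also be used to rank candidate category strings against a free-text query. The query and category strings below are illustrative placeholders, not taken from the training data:

# Hypothetical label set; replace with the categories used in your application
categories = ["moda acessorio", "esporte lazer", "casa decoracao"]

query_emb = model.encode(["tenis corrida masculino"])  # hypothetical query
category_embs = model.encode(categories)

scores = model.similarity(query_emb, category_embs)    # tensor of shape [1, 3]
best = int(scores.argmax())
print(categories[best], float(scores[0, best]))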

Evaluation

Metrics

Semantic Similarity

Metric             Value
pearson_cosine     0.9058
spearman_cosine    0.8399
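
Metrics of this form are typically produced by the sentence-transformers EmbeddingSimilarityEvaluator (Pearson and Spearman correlation between cosine similarities and gold labels). The sketch below shows how they can be recomputed on labeled pairs; the pairs and scores are placeholders, not the actual evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("SenhorDasMoscas/bert-ptbr-e3-lr0.0001-04-01-2025")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["preciso pao frances integral", "cobertor pelucia"],  # placeholder pairs
    sentences2=["padaria confeitaria", "moda acessorio"],
    scores=[1.0, 0.1],                                                # gold labels in [0, 1]
    name="eval-similarity",
)
print(evaluator(model))  # dict of metrics, including the Pearson and Spearman cosine correlations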

Training Details

Training Dataset

Unnamed Dataset

  • Size: 18,623 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
              text1              text2              label
    type      string             string             float
    details   min: 3 tokens      min: 3 tokens      min: 0.1
              mean: 7.67 tokens  mean: 6.58 tokens  mean: 0.54
              max: 17 tokens     max: 11 tokens     max: 1.0
  • Samples:
    text1                            text2                      label
    tabua carne                      casa decoracao             1.0
    caminhaor basculante brinquedo   brinquedo jogo educativo   1.0
    buscar mochila escolar crianca   comida rapido fastfood     0.1
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Evaluation Dataset

Unnamed Dataset

  • Size: 2,070 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
              text1              text2              label
    type      string             string             float
    details   min: 3 tokens      min: 3 tokens      min: 0.1
              mean: 7.69 tokens  mean: 6.54 tokens  mean: 0.59
              max: 17 tokens     max: 11 tokens     max: 1.0
  • Samples:
    text1                           text2                     label
    preciso pao frances integral    padaria confeitaria       1.0
    onde poder comprar microfone    joia bijuterio            0.1
    chuveiro eletrico lorenzetti    livro material literario  0.1
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • warmup_ratio: 0.1
  • warmup_steps: 232
  • fp16: True
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 232
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss eval-similarity_spearman_cosine
0.0086 5 0.2031 - -
0.0172 10 0.2078 - -
0.0258 15 0.2062 - -
0.0344 20 0.1693 - -
0.0430 25 0.1681 - -
0.0515 30 0.1639 - -
0.0601 35 0.1393 - -
0.0687 40 0.1675 - -
0.0773 45 0.1297 - -
0.0859 50 0.1223 - -
0.0945 55 0.1203 - -
0.1031 60 0.0942 - -
0.1117 65 0.0922 - -
0.1203 70 0.097 - -
0.1289 75 0.0927 - -
0.1375 80 0.0961 - -
0.1460 85 0.0821 - -
0.1546 90 0.0621 - -
0.1632 95 0.084 - -
0.1718 100 0.0706 - -
0.1804 105 0.0701 - -
0.1890 110 0.0828 - -
0.1976 115 0.078 - -
0.2062 120 0.0745 - -
0.2148 125 0.0744 - -
0.2234 130 0.0785 - -
0.2320 135 0.0745 - -
0.2405 140 0.0615 - -
0.2491 145 0.0665 - -
0.2577 150 0.0873 - -
0.2663 155 0.0916 - -
0.2749 160 0.0659 - -
0.2835 165 0.0896 - -
0.2921 170 0.0807 - -
0.3007 175 0.0745 - -
0.3093 180 0.0794 - -
0.3179 185 0.0703 - -
0.3265 190 0.0705 - -
0.3351 195 0.084 - -
0.3436 200 0.0671 - -
0.3522 205 0.076 - -
0.3608 210 0.0821 - -
0.3694 215 0.0499 - -
0.3780 220 0.0729 - -
0.3866 225 0.0697 - -
0.3952 230 0.085 - -
0.4038 235 0.0835 - -
0.4124 240 0.0743 - -
0.4210 245 0.0714 - -
0.4296 250 0.0597 - -
0.4381 255 0.0626 - -
0.4467 260 0.0522 - -
0.4553 265 0.0734 - -
0.4639 270 0.0616 - -
0.4725 275 0.0463 - -
0.4811 280 0.0631 - -
0.4897 285 0.0672 - -
0.4983 290 0.0725 - -
0.5069 295 0.043 - -
0.5155 300 0.0675 0.0698 0.7861
0.5241 305 0.0837 - -
0.5326 310 0.0785 - -
0.5412 315 0.0761 - -
0.5498 320 0.0523 - -
0.5584 325 0.0514 - -
0.5670 330 0.0726 - -
0.5756 335 0.0584 - -
0.5842 340 0.0736 - -
0.5928 345 0.0705 - -
0.6014 350 0.0682 - -
0.6100 355 0.0636 - -
0.6186 360 0.0484 - -
0.6271 365 0.0524 - -
0.6357 370 0.0657 - -
0.6443 375 0.0766 - -
0.6529 380 0.0759 - -
0.6615 385 0.071 - -
0.6701 390 0.055 - -
0.6787 395 0.0466 - -
0.6873 400 0.0697 - -
0.6959 405 0.0546 - -
0.7045 410 0.0692 - -
0.7131 415 0.0519 - -
0.7216 420 0.0521 - -
0.7302 425 0.0449 - -
0.7388 430 0.0646 - -
0.7474 435 0.0585 - -
0.7560 440 0.0536 - -
0.7646 445 0.0592 - -
0.7732 450 0.0515 - -
0.7818 455 0.0676 - -
0.7904 460 0.0732 - -
0.7990 465 0.0618 - -
0.8076 470 0.0579 - -
0.8162 475 0.0516 - -
0.8247 480 0.0659 - -
0.8333 485 0.0583 - -
0.8419 490 0.0624 - -
0.8505 495 0.0667 - -
0.8591 500 0.052 - -
0.8677 505 0.0858 - -
0.8763 510 0.0441 - -
0.8849 515 0.0592 - -
0.8935 520 0.0532 - -
0.9021 525 0.0478 - -
0.9107 530 0.062 - -
0.9192 535 0.0487 - -
0.9278 540 0.0704 - -
0.9364 545 0.0467 - -
0.9450 550 0.0482 - -
0.9536 555 0.0796 - -
0.9622 560 0.0568 - -
0.9708 565 0.0588 - -
0.9794 570 0.0514 - -
0.9880 575 0.0543 - -
0.9966 580 0.0568 - -
1.0052 585 0.0513 - -
1.0137 590 0.0361 - -
1.0223 595 0.0405 - -
1.0309 600 0.0347 0.0491 0.8180
1.0395 605 0.0459 - -
1.0481 610 0.0557 - -
1.0567 615 0.0447 - -
1.0653 620 0.0279 - -
1.0739 625 0.0417 - -
1.0825 630 0.025 - -
1.0911 635 0.0399 - -
1.0997 640 0.0466 - -
1.1082 645 0.0294 - -
1.1168 650 0.035 - -
1.1254 655 0.0376 - -
1.1340 660 0.0414 - -
1.1426 665 0.0502 - -
1.1512 670 0.04 - -
1.1598 675 0.0385 - -
1.1684 680 0.0286 - -
1.1770 685 0.0361 - -
1.1856 690 0.0282 - -
1.1942 695 0.0473 - -
1.2027 700 0.0346 - -
1.2113 705 0.0295 - -
1.2199 710 0.0283 - -
1.2285 715 0.0301 - -
1.2371 720 0.0565 - -
1.2457 725 0.0325 - -
1.2543 730 0.0299 - -
1.2629 735 0.0417 - -
1.2715 740 0.0398 - -
1.2801 745 0.0477 - -
1.2887 750 0.0418 - -
1.2973 755 0.034 - -
1.3058 760 0.0397 - -
1.3144 765 0.0308 - -
1.3230 770 0.0457 - -
1.3316 775 0.0328 - -
1.3402 780 0.0222 - -
1.3488 785 0.0246 - -
1.3574 790 0.0229 - -
1.3660 795 0.0351 - -
1.3746 800 0.0415 - -
1.3832 805 0.0351 - -
1.3918 810 0.0269 - -
1.4003 815 0.0307 - -
1.4089 820 0.0381 - -
1.4175 825 0.0425 - -
1.4261 830 0.0557 - -
1.4347 835 0.0523 - -
1.4433 840 0.0488 - -
1.4519 845 0.0355 - -
1.4605 850 0.0403 - -
1.4691 855 0.0332 - -
1.4777 860 0.0427 - -
1.4863 865 0.0348 - -
1.4948 870 0.0375 - -
1.5034 875 0.0271 - -
1.5120 880 0.0428 - -
1.5206 885 0.0666 - -
1.5292 890 0.0491 - -
1.5378 895 0.0424 - -
1.5464 900 0.0413 0.0418 0.8326
1.5550 905 0.0469 - -
1.5636 910 0.0288 - -
1.5722 915 0.0541 - -
1.5808 920 0.017 - -
1.5893 925 0.0505 - -
1.5979 930 0.0341 - -
1.6065 935 0.0223 - -
1.6151 940 0.0469 - -
1.6237 945 0.0386 - -
1.6323 950 0.0214 - -
1.6409 955 0.0329 - -
1.6495 960 0.0398 - -
1.6581 965 0.0355 - -
1.6667 970 0.0373 - -
1.6753 975 0.0339 - -
1.6838 980 0.0349 - -
1.6924 985 0.0439 - -
1.7010 990 0.0425 - -
1.7096 995 0.0318 - -
1.7182 1000 0.025 - -
1.7268 1005 0.0334 - -
1.7354 1010 0.0327 - -
1.7440 1015 0.0356 - -
1.7526 1020 0.0428 - -
1.7612 1025 0.0432 - -
1.7698 1030 0.0334 - -
1.7784 1035 0.032 - -
1.7869 1040 0.0318 - -
1.7955 1045 0.0281 - -
1.8041 1050 0.0231 - -
1.8127 1055 0.0436 - -
1.8213 1060 0.0303 - -
1.8299 1065 0.0489 - -
1.8385 1070 0.0292 - -
1.8471 1075 0.06 - -
1.8557 1080 0.0329 - -
1.8643 1085 0.0322 - -
1.8729 1090 0.0426 - -
1.8814 1095 0.0263 - -
1.8900 1100 0.024 - -
1.8986 1105 0.0228 - -
1.9072 1110 0.0313 - -
1.9158 1115 0.044 - -
1.9244 1120 0.036 - -
1.9330 1125 0.0252 - -
1.9416 1130 0.0311 - -
1.9502 1135 0.0452 - -
1.9588 1140 0.0338 - -
1.9674 1145 0.0447 - -
1.9759 1150 0.0318 - -
1.9845 1155 0.0428 - -
1.9931 1160 0.03 - -
2.0017 1165 0.0314 - -
2.0103 1170 0.0181 - -
2.0189 1175 0.0137 - -
2.0275 1180 0.0242 - -
2.0361 1185 0.03 - -
2.0447 1190 0.0267 - -
2.0533 1195 0.0263 - -
2.0619 1200 0.0219 0.0392 0.8360
2.0704 1205 0.0189 - -
2.0790 1210 0.0193 - -
2.0876 1215 0.0345 - -
2.0962 1220 0.0136 - -
2.1048 1225 0.0346 - -
2.1134 1230 0.0163 - -
2.1220 1235 0.0264 - -
2.1306 1240 0.0172 - -
2.1392 1245 0.0163 - -
2.1478 1250 0.0226 - -
2.1564 1255 0.0229 - -
2.1649 1260 0.0185 - -
2.1735 1265 0.0134 - -
2.1821 1270 0.0144 - -
2.1907 1275 0.0215 - -
2.1993 1280 0.0291 - -
2.2079 1285 0.0305 - -
2.2165 1290 0.0192 - -
2.2251 1295 0.0272 - -
2.2337 1300 0.0267 - -
2.2423 1305 0.0265 - -
2.2509 1310 0.0207 - -
2.2595 1315 0.0305 - -
2.2680 1320 0.0292 - -
2.2766 1325 0.017 - -
2.2852 1330 0.0242 - -
2.2938 1335 0.016 - -
2.3024 1340 0.0241 - -
2.3110 1345 0.0193 - -
2.3196 1350 0.0134 - -
2.3282 1355 0.0206 - -
2.3368 1360 0.0218 - -
2.3454 1365 0.0239 - -
2.3540 1370 0.0314 - -
2.3625 1375 0.028 - -
2.3711 1380 0.021 - -
2.3797 1385 0.0179 - -
2.3883 1390 0.0173 - -
2.3969 1395 0.0228 - -
2.4055 1400 0.0217 - -
2.4141 1405 0.0243 - -
2.4227 1410 0.018 - -
2.4313 1415 0.0233 - -
2.4399 1420 0.016 - -
2.4485 1425 0.0308 - -
2.4570 1430 0.0239 - -
2.4656 1435 0.018 - -
2.4742 1440 0.016 - -
2.4828 1445 0.0189 - -
2.4914 1450 0.0215 - -
2.5 1455 0.027 - -
2.5086 1460 0.0177 - -
2.5172 1465 0.0325 - -
2.5258 1470 0.0136 - -
2.5344 1475 0.0235 - -
2.5430 1480 0.0362 - -
2.5515 1485 0.0302 - -
2.5601 1490 0.0137 - -
2.5687 1495 0.0162 - -
2.5773 1500 0.0174 0.0376 0.8399
2.5859 1505 0.0248 - -
2.5945 1510 0.0131 - -
2.6031 1515 0.0188 - -
2.6117 1520 0.011 - -
2.6203 1525 0.0174 - -
2.6289 1530 0.0192 - -
2.6375 1535 0.0113 - -
2.6460 1540 0.0304 - -
2.6546 1545 0.0217 - -
2.6632 1550 0.0102 - -
2.6718 1555 0.0164 - -
2.6804 1560 0.017 - -
2.6890 1565 0.0146 - -
2.6976 1570 0.0139 - -
2.7062 1575 0.0171 - -
2.7148 1580 0.0137 - -
2.7234 1585 0.008 - -
2.7320 1590 0.0222 - -
2.7405 1595 0.0295 - -
2.7491 1600 0.0178 - -
2.7577 1605 0.0144 - -
2.7663 1610 0.023 - -
2.7749 1615 0.0135 - -
2.7835 1620 0.0213 - -
2.7921 1625 0.0213 - -
2.8007 1630 0.0212 - -
2.8093 1635 0.0164 - -
2.8179 1640 0.0212 - -
2.8265 1645 0.0157 - -
2.8351 1650 0.0251 - -
2.8436 1655 0.0276 - -
2.8522 1660 0.0104 - -
2.8608 1665 0.0123 - -
2.8694 1670 0.0339 - -
2.8780 1675 0.0203 - -
2.8866 1680 0.0171 - -
2.8952 1685 0.0304 - -
2.9038 1690 0.015 - -
2.9124 1695 0.0177 - -
2.9210 1700 0.0176 - -
2.9296 1705 0.0229 - -
2.9381 1710 0.0166 - -
2.9467 1715 0.0185 - -
2.9553 1720 0.017 - -
2.9639 1725 0.0109 - -
2.9725 1730 0.0154 - -
2.9811 1735 0.0226 - -
2.9897 1740 0.0142 - -
2.9983 1745 0.0257 - -
  • The saved checkpoint is the row with the best eval-similarity_spearman_cosine (0.8399): epoch 2.5773, step 1500.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.0
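
To approximate this environment when reproducing results, the listed versions can be pinned at install time (the appropriate PyTorch CUDA build depends on your local setup):

pip install sentence-transformers==3.3.1 transformers==4.47.1 accelerate==1.2.1 datasets==2.14.4 tokenizers==0.21.0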

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}