SentenceTransformer based on neuralmind/bert-large-portuguese-cased

This is a sentence-transformers model finetuned from neuralmind/bert-large-portuguese-cased. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: neuralmind/bert-large-portuguese-cased
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
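
The Pooling module above averages BERT's token embeddings (ignoring padding) into a single 1024-dimensional sentence vector. Below is a minimal sketch of that mean-pooling step, for illustration only; the SentenceTransformer pipeline applies it internally, and the function name here is ours:

import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # token_embeddings: (batch, seq_len, 1024); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    summed = (token_embeddings * mask).sum(dim=1)  # sum of non-padding token vectors
    counts = mask.sum(dim=1).clamp(min=1e-9)       # number of non-padding tokens, broadcast over the 1024 dims
    return summed / counts                         # (batch, 1024) sentence embeddings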

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("SenhorDasMoscas/bert-ptbr-e3-lr0.0001-04-01-2025")
# Run inference
sentences = [
    'cobertor pelucia',
    'moda acessorio',
    'servico reparo eletronico',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
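
Continuing from the snippet above, the embeddings can also be used to rank candidate category strings against a free-text query. The query and category strings below are illustrative placeholders, not taken from the training data:

# Hypothetical label set; replace with the categories used in your application
categories = ["moda acessorio", "esporte lazer", "casa decoracao"]

query_emb = model.encode(["tenis corrida masculino"])  # hypothetical query
category_embs = model.encode(categories)

scores = model.similarity(query_emb, category_embs)    # tensor of shape [1, 3]
best = int(scores.argmax())
print(categories[best], float(scores[0, best]))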

Evaluation

Metrics

Semantic Similarity

Metric             Value
pearson_cosine     0.9058
spearman_cosine    0.8399
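
Metrics of this form are typically produced by the sentence-transformers EmbeddingSimilarityEvaluator (Pearson and Spearman correlation between cosine similarities and gold labels). The sketch below shows how they can be recomputed on labeled pairs; the pairs and scores are placeholders, not the actual evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("SenhorDasMoscas/bert-ptbr-e3-lr0.0001-04-01-2025")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["preciso pao frances integral", "cobertor pelucia"],  # placeholder pairs
    sentences2=["padaria confeitaria", "moda acessorio"],
    scores=[1.0, 0.1],                                                # gold labels in [0, 1]
    name="eval-similarity",
)
print(evaluator(model))  # dict of metrics, including the Pearson and Spearman cosine correlations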

Training Details

Training Dataset

Unnamed Dataset

  • Size: 18,623 training samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
              text1              text2              label
    type      string             string             float
    details   min: 3 tokens      min: 3 tokens      min: 0.1
              mean: 7.67 tokens  mean: 6.58 tokens  mean: 0.54
              max: 17 tokens     max: 11 tokens     max: 1.0
  • Samples:
    text1                            text2                      label
    tabua carne                      casa decoracao             1.0
    caminhaor basculante brinquedo   brinquedo jogo educativo   1.0
    buscar mochila escolar crianca   comida rapido fastfood     0.1
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Evaluation Dataset

Unnamed Dataset

  • Size: 2,070 evaluation samples
  • Columns: text1, text2, and label
  • Approximate statistics based on the first 1000 samples:
              text1              text2              label
    type      string             string             float
    details   min: 3 tokens      min: 3 tokens      min: 0.1
              mean: 7.69 tokens  mean: 6.54 tokens  mean: 0.59
              max: 17 tokens     max: 11 tokens     max: 1.0
  • Samples:
    text1                           text2                     label
    preciso pao frances integral    padaria confeitaria       1.0
    onde poder comprar microfone    joia bijuterio            0.1
    chuveiro eletrico lorenzetti    livro material literario  0.1
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • warmup_ratio: 0.1
  • warmup_steps: 232
  • fp16: True
  • load_best_model_at_end: True

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0001
  • weight_decay: 0.1
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 232
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss eval-similarity_spearman_cosine
0.0086 5 0.2031 - -
0.0172 10 0.2078 - -
0.0258 15 0.2062 - -
0.0344 20 0.1693 - -
0.0430 25 0.1681 - -
0.0515 30 0.1639 - -
0.0601 35 0.1393 - -
0.0687 40 0.1675 - -
0.0773 45 0.1297 - -
0.0859 50 0.1223 - -
0.0945 55 0.1203 - -
0.1031 60 0.0942 - -
0.1117 65 0.0922 - -
0.1203 70 0.097 - -
0.1289 75 0.0927 - -
0.1375 80 0.0961 - -
0.1460 85 0.0821 - -
0.1546 90 0.0621 - -
0.1632 95 0.084 - -
0.1718 100 0.0706 - -
0.1804 105 0.0701 - -
0.1890 110 0.0828 - -
0.1976 115 0.078 - -
0.2062 120 0.0745 - -
0.2148 125 0.0744 - -
0.2234 130 0.0785 - -
0.2320 135 0.0745 - -
0.2405 140 0.0615 - -
0.2491 145 0.0665 - -
0.2577 150 0.0873 - -
0.2663 155 0.0916 - -
0.2749 160 0.0659 - -
0.2835 165 0.0896 - -
0.2921 170 0.0807 - -
0.3007 175 0.0745 - -
0.3093 180 0.0794 - -
0.3179 185 0.0703 - -
0.3265 190 0.0705 - -
0.3351 195 0.084 - -
0.3436 200 0.0671 - -
0.3522 205 0.076 - -
0.3608 210 0.0821 - -
0.3694 215 0.0499 - -
0.3780 220 0.0729 - -
0.3866 225 0.0697 - -
0.3952 230 0.085 - -
0.4038 235 0.0835 - -
0.4124 240 0.0743 - -
0.4210 245 0.0714 - -
0.4296 250 0.0597 - -
0.4381 255 0.0626 - -
0.4467 260 0.0522 - -
0.4553 265 0.0734 - -
0.4639 270 0.0616 - -
0.4725 275 0.0463 - -
0.4811 280 0.0631 - -
0.4897 285 0.0672 - -
0.4983 290 0.0725 - -
0.5069 295 0.043 - -
0.5155 300 0.0675 0.0698 0.7861
0.5241 305 0.0837 - -
0.5326 310 0.0785 - -
0.5412 315 0.0761 - -
0.5498 320 0.0523 - -
0.5584 325 0.0514 - -
0.5670 330 0.0726 - -
0.5756 335 0.0584 - -
0.5842 340 0.0736 - -
0.5928 345 0.0705 - -
0.6014 350 0.0682 - -
0.6100 355 0.0636 - -
0.6186 360 0.0484 - -
0.6271 365 0.0524 - -
0.6357 370 0.0657 - -
0.6443 375 0.0766 - -
0.6529 380 0.0759 - -
0.6615 385 0.071 - -
0.6701 390 0.055 - -
0.6787 395 0.0466 - -
0.6873 400 0.0697 - -
0.6959 405 0.0546 - -
0.7045 410 0.0692 - -
0.7131 415 0.0519 - -
0.7216 420 0.0521 - -
0.7302 425 0.0449 - -
0.7388 430 0.0646 - -
0.7474 435 0.0585 - -
0.7560 440 0.0536 - -
0.7646 445 0.0592 - -
0.7732 450 0.0515 - -
0.7818 455 0.0676 - -
0.7904 460 0.0732 - -
0.7990 465 0.0618 - -
0.8076 470 0.0579 - -
0.8162 475 0.0516 - -
0.8247 480 0.0659 - -
0.8333 485 0.0583 - -
0.8419 490 0.0624 - -
0.8505 495 0.0667 - -
0.8591 500 0.052 - -
0.8677 505 0.0858 - -
0.8763 510 0.0441 - -
0.8849 515 0.0592 - -
0.8935 520 0.0532 - -
0.9021 525 0.0478 - -
0.9107 530 0.062 - -
0.9192 535 0.0487 - -
0.9278 540 0.0704 - -
0.9364 545 0.0467 - -
0.9450 550 0.0482 - -
0.9536 555 0.0796 - -
0.9622 560 0.0568 - -
0.9708 565 0.0588 - -
0.9794 570 0.0514 - -
0.9880 575 0.0543 - -
0.9966 580 0.0568 - -
1.0052 585 0.0513 - -
1.0137 590 0.0361 - -
1.0223 595 0.0405 - -
1.0309 600 0.0347 0.0491 0.8180
1.0395 605 0.0459 - -
1.0481 610 0.0557 - -
1.0567 615 0.0447 - -
1.0653 620 0.0279 - -
1.0739 625 0.0417 - -
1.0825 630 0.025 - -
1.0911 635 0.0399 - -
1.0997 640 0.0466 - -
1.1082 645 0.0294 - -
1.1168 650 0.035 - -
1.1254 655 0.0376 - -
1.1340 660 0.0414 - -
1.1426 665 0.0502 - -
1.1512 670 0.04 - -
1.1598 675 0.0385 - -
1.1684 680 0.0286 - -
1.1770 685 0.0361 - -
1.1856 690 0.0282 - -
1.1942 695 0.0473 - -
1.2027 700 0.0346 - -
1.2113 705 0.0295 - -
1.2199 710 0.0283 - -
1.2285 715 0.0301 - -
1.2371 720 0.0565 - -
1.2457 725 0.0325 - -
1.2543 730 0.0299 - -
1.2629 735 0.0417 - -
1.2715 740 0.0398 - -
1.2801 745 0.0477 - -
1.2887 750 0.0418 - -
1.2973 755 0.034 - -
1.3058 760 0.0397 - -
1.3144 765 0.0308 - -
1.3230 770 0.0457 - -
1.3316 775 0.0328 - -
1.3402 780 0.0222 - -
1.3488 785 0.0246 - -
1.3574 790 0.0229 - -
1.3660 795 0.0351 - -
1.3746 800 0.0415 - -
1.3832 805 0.0351 - -
1.3918 810 0.0269 - -
1.4003 815 0.0307 - -
1.4089 820 0.0381 - -
1.4175 825 0.0425 - -
1.4261 830 0.0557 - -
1.4347 835 0.0523 - -
1.4433 840 0.0488 - -
1.4519 845 0.0355 - -
1.4605 850 0.0403 - -
1.4691 855 0.0332 - -
1.4777 860 0.0427 - -
1.4863 865 0.0348 - -
1.4948 870 0.0375 - -
1.5034 875 0.0271 - -
1.5120 880 0.0428 - -
1.5206 885 0.0666 - -
1.5292 890 0.0491 - -
1.5378 895 0.0424 - -
1.5464 900 0.0413 0.0418 0.8326
1.5550 905 0.0469 - -
1.5636 910 0.0288 - -
1.5722 915 0.0541 - -
1.5808 920 0.017 - -
1.5893 925 0.0505 - -
1.5979 930 0.0341 - -
1.6065 935 0.0223 - -
1.6151 940 0.0469 - -
1.6237 945 0.0386 - -
1.6323 950 0.0214 - -
1.6409 955 0.0329 - -
1.6495 960 0.0398 - -
1.6581 965 0.0355 - -
1.6667 970 0.0373 - -
1.6753 975 0.0339 - -
1.6838 980 0.0349 - -
1.6924 985 0.0439 - -
1.7010 990 0.0425 - -
1.7096 995 0.0318 - -
1.7182 1000 0.025 - -
1.7268 1005 0.0334 - -
1.7354 1010 0.0327 - -
1.7440 1015 0.0356 - -
1.7526 1020 0.0428 - -
1.7612 1025 0.0432 - -
1.7698 1030 0.0334 - -
1.7784 1035 0.032 - -
1.7869 1040 0.0318 - -
1.7955 1045 0.0281 - -
1.8041 1050 0.0231 - -
1.8127 1055 0.0436 - -
1.8213 1060 0.0303 - -
1.8299 1065 0.0489 - -
1.8385 1070 0.0292 - -
1.8471 1075 0.06 - -
1.8557 1080 0.0329 - -
1.8643 1085 0.0322 - -
1.8729 1090 0.0426 - -
1.8814 1095 0.0263 - -
1.8900 1100 0.024 - -
1.8986 1105 0.0228 - -
1.9072 1110 0.0313 - -
1.9158 1115 0.044 - -
1.9244 1120 0.036 - -
1.9330 1125 0.0252 - -
1.9416 1130 0.0311 - -
1.9502 1135 0.0452 - -
1.9588 1140 0.0338 - -
1.9674 1145 0.0447 - -
1.9759 1150 0.0318 - -
1.9845 1155 0.0428 - -
1.9931 1160 0.03 - -
2.0017 1165 0.0314 - -
2.0103 1170 0.0181 - -
2.0189 1175 0.0137 - -
2.0275 1180 0.0242 - -
2.0361 1185 0.03 - -
2.0447 1190 0.0267 - -
2.0533 1195 0.0263 - -
2.0619 1200 0.0219 0.0392 0.8360
2.0704 1205 0.0189 - -
2.0790 1210 0.0193 - -
2.0876 1215 0.0345 - -
2.0962 1220 0.0136 - -
2.1048 1225 0.0346 - -
2.1134 1230 0.0163 - -
2.1220 1235 0.0264 - -
2.1306 1240 0.0172 - -
2.1392 1245 0.0163 - -
2.1478 1250 0.0226 - -
2.1564 1255 0.0229 - -
2.1649 1260 0.0185 - -
2.1735 1265 0.0134 - -
2.1821 1270 0.0144 - -
2.1907 1275 0.0215 - -
2.1993 1280 0.0291 - -
2.2079 1285 0.0305 - -
2.2165 1290 0.0192 - -
2.2251 1295 0.0272 - -
2.2337 1300 0.0267 - -
2.2423 1305 0.0265 - -
2.2509 1310 0.0207 - -
2.2595 1315 0.0305 - -
2.2680 1320 0.0292 - -
2.2766 1325 0.017 - -
2.2852 1330 0.0242 - -
2.2938 1335 0.016 - -
2.3024 1340 0.0241 - -
2.3110 1345 0.0193 - -
2.3196 1350 0.0134 - -
2.3282 1355 0.0206 - -
2.3368 1360 0.0218 - -
2.3454 1365 0.0239 - -
2.3540 1370 0.0314 - -
2.3625 1375 0.028 - -
2.3711 1380 0.021 - -
2.3797 1385 0.0179 - -
2.3883 1390 0.0173 - -
2.3969 1395 0.0228 - -
2.4055 1400 0.0217 - -
2.4141 1405 0.0243 - -
2.4227 1410 0.018 - -
2.4313 1415 0.0233 - -
2.4399 1420 0.016 - -
2.4485 1425 0.0308 - -
2.4570 1430 0.0239 - -
2.4656 1435 0.018 - -
2.4742 1440 0.016 - -
2.4828 1445 0.0189 - -
2.4914 1450 0.0215 - -
2.5 1455 0.027 - -
2.5086 1460 0.0177 - -
2.5172 1465 0.0325 - -
2.5258 1470 0.0136 - -
2.5344 1475 0.0235 - -
2.5430 1480 0.0362 - -
2.5515 1485 0.0302 - -
2.5601 1490 0.0137 - -
2.5687 1495 0.0162 - -
2.5773 1500 0.0174 0.0376 0.8399
2.5859 1505 0.0248 - -
2.5945 1510 0.0131 - -
2.6031 1515 0.0188 - -
2.6117 1520 0.011 - -
2.6203 1525 0.0174 - -
2.6289 1530 0.0192 - -
2.6375 1535 0.0113 - -
2.6460 1540 0.0304 - -
2.6546 1545 0.0217 - -
2.6632 1550 0.0102 - -
2.6718 1555 0.0164 - -
2.6804 1560 0.017 - -
2.6890 1565 0.0146 - -
2.6976 1570 0.0139 - -
2.7062 1575 0.0171 - -
2.7148 1580 0.0137 - -
2.7234 1585 0.008 - -
2.7320 1590 0.0222 - -
2.7405 1595 0.0295 - -
2.7491 1600 0.0178 - -
2.7577 1605 0.0144 - -
2.7663 1610 0.023 - -
2.7749 1615 0.0135 - -
2.7835 1620 0.0213 - -
2.7921 1625 0.0213 - -
2.8007 1630 0.0212 - -
2.8093 1635 0.0164 - -
2.8179 1640 0.0212 - -
2.8265 1645 0.0157 - -
2.8351 1650 0.0251 - -
2.8436 1655 0.0276 - -
2.8522 1660 0.0104 - -
2.8608 1665 0.0123 - -
2.8694 1670 0.0339 - -
2.8780 1675 0.0203 - -
2.8866 1680 0.0171 - -
2.8952 1685 0.0304 - -
2.9038 1690 0.015 - -
2.9124 1695 0.0177 - -
2.9210 1700 0.0176 - -
2.9296 1705 0.0229 - -
2.9381 1710 0.0166 - -
2.9467 1715 0.0185 - -
2.9553 1720 0.017 - -
2.9639 1725 0.0109 - -
2.9725 1730 0.0154 - -
2.9811 1735 0.0226 - -
2.9897 1740 0.0142 - -
2.9983 1745 0.0257 - -
  • The saved checkpoint is the row with the best eval-similarity_spearman_cosine (0.8399): epoch 2.5773, step 1500.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.1
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.0
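
To approximate this environment when reproducing results, the listed versions can be pinned at install time (the appropriate PyTorch CUDA build depends on your local setup):

pip install sentence-transformers==3.3.1 transformers==4.47.1 accelerate==1.2.1 datasets==2.14.4 tokenizers==0.21.0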

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}