mms-1b-toigen-male-model

This model is a fine-tuned version of facebook/mms-1b-all on the TOIGEN - TOI dataset. On the evaluation set it achieves the following results, which match the last logged checkpoint (step 3000):

  • Loss: 0.4348
  • Wer: 0.3988
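
Wer is the word error rate: the minimum number of word substitutions, deletions and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. A value of 0.3988 therefore means roughly 40 word-level errors per 100 reference words. The card does not say which implementation produced the figure, so the following is only a minimal sketch of the metric itself; the function name `word_error_rate` and the sample strings are illustrative and not taken from the training code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings: one of four reference words is missing.
print(word_error_rate("a b c d", "a b d"))  # 0.25
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.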

Model description

More information needed

Intended uses & limitations

More information needed
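
Until this section is filled in, the sketch below shows one plausible way to transcribe audio with the checkpoint. It rests on several assumptions: that the repository ID is csikasote/mms-1b-toigen-male-model, that the checkpoint loads through the standard Transformers automatic-speech-recognition pipeline without extra adapter or language arguments, and that ffmpeg is available for decoding audio files. The helper name `transcribe` is hypothetical.

```python
REPO_ID = "csikasote/mms-1b-toigen-male-model"  # assumed repository ID


def transcribe(audio_path: str, repo_id: str = REPO_ID) -> str:
    """Return the transcript of one audio file (sketch, not verified on this model)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    # Downloads the checkpoint on first use (about 965M parameters).
    asr = pipeline("automatic-speech-recognition", model=repo_id)
    result = asr(audio_path)  # decoding a file path relies on ffmpeg
    return result["text"]
```

A call would look like `transcribe("sample.wav")`, where `sample.wav` is a placeholder path. With a word error rate near 0.40, transcripts should be reviewed before any downstream use.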

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
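
As a rough guide to reproducing this setup, the values above can be written as keyword arguments for `transformers.TrainingArguments`, and the linear schedule with warmup can be computed by hand. This is a sketch under assumptions rather than the original training script: `fp16` is a guess at what "Native AMP" refers to (it could equally be bf16), the batch size is assumed to be per device on a single device, and the 198 steps per epoch are inferred from the training log (step 200 falls at epoch 1.0101).

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 4,  # assumption: single device
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 30.0,
    "fp16": True,  # assumption: "Native AMP" may also mean bf16
}

STEPS_PER_EPOCH = 198  # assumption inferred from the training log
TOTAL_STEPS = int(STEPS_PER_EPOCH * TRAINING_KWARGS["num_train_epochs"])


def linear_lr(step: int,
              base_lr: float = TRAINING_KWARGS["learning_rate"],
              warmup: int = TRAINING_KWARGS["warmup_steps"],
              total: int = TOTAL_STEPS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 50, 100, 3000, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Passing the dictionary as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)` would need a GPU for the mixed-precision flag; `"out"` is a placeholder directory.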

Training results

Training Loss  Epoch    Step  Validation Loss  Wer
8.0084         0.5051   100   3.7458           0.9984
2.8136         1.0101   200   1.0029           0.7422
0.9647         1.5152   300   0.5870           0.5274
0.8539         2.0202   400   0.5461           0.5048
0.7525         2.5253   500   0.5256           0.4989
0.7307         3.0303   600   0.5101           0.4871
0.6997         3.5354   700   0.5032           0.4688
0.6882         4.0404   800   0.4879           0.4736
0.651          4.5455   900   0.4788           0.4559
0.6623         5.0505   1000  0.4799           0.4526
0.6339         5.5556   1100  0.4677           0.4419
0.6424         6.0606   1200  0.4650           0.4429
0.6365         6.5657   1300  0.4746           0.4462
0.556          7.0707   1400  0.4512           0.4381
0.5969         7.5758   1500  0.4597           0.4413
0.5772         8.0808   1600  0.4455           0.4284
0.5695         8.5859   1700  0.4565           0.4268
0.5752         9.0909   1800  0.4414           0.4187
0.5734         9.5960   1900  0.4450           0.4085
0.5465         10.1010  2000  0.4373           0.4155
0.5553         10.6061  2100  0.4520           0.4241
0.5289         11.1111  2200  0.4306           0.4085
0.5122         11.6162  2300  0.4372           0.4015
0.5659         12.1212  2400  0.4408           0.4010
0.5007         12.6263  2500  0.4274           0.3983
0.5366         13.1313  2600  0.4266           0.4026
0.5068         13.6364  2700  0.4366           0.3961
0.507          14.1414  2800  0.4359           0.3972
0.5031         14.6465  2900  0.4334           0.3967
0.4949         15.1515  3000  0.4348           0.3988
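
The log above can be checked programmatically, for example to see which evaluation step scored best on each metric and how many optimizer steps make up one epoch. The sketch below copies the Step, Epoch, Validation Loss and Wer columns from the table; the variable names are illustrative.

```python
# (step, epoch, validation_loss, wer) rows copied from the training log.
LOG = [
    (100, 0.5051, 3.7458, 0.9984), (200, 1.0101, 1.0029, 0.7422),
    (300, 1.5152, 0.5870, 0.5274), (400, 2.0202, 0.5461, 0.5048),
    (500, 2.5253, 0.5256, 0.4989), (600, 3.0303, 0.5101, 0.4871),
    (700, 3.5354, 0.5032, 0.4688), (800, 4.0404, 0.4879, 0.4736),
    (900, 4.5455, 0.4788, 0.4559), (1000, 5.0505, 0.4799, 0.4526),
    (1100, 5.5556, 0.4677, 0.4419), (1200, 6.0606, 0.4650, 0.4429),
    (1300, 6.5657, 0.4746, 0.4462), (1400, 7.0707, 0.4512, 0.4381),
    (1500, 7.5758, 0.4597, 0.4413), (1600, 8.0808, 0.4455, 0.4284),
    (1700, 8.5859, 0.4565, 0.4268), (1800, 9.0909, 0.4414, 0.4187),
    (1900, 9.5960, 0.4450, 0.4085), (2000, 10.1010, 0.4373, 0.4155),
    (2100, 10.6061, 0.4520, 0.4241), (2200, 11.1111, 0.4306, 0.4085),
    (2300, 11.6162, 0.4372, 0.4015), (2400, 12.1212, 0.4408, 0.4010),
    (2500, 12.6263, 0.4274, 0.3983), (2600, 13.1313, 0.4266, 0.4026),
    (2700, 13.6364, 0.4366, 0.3961), (2800, 14.1414, 0.4359, 0.3972),
    (2900, 14.6465, 0.4334, 0.3967), (3000, 15.1515, 0.4348, 0.3988),
]

best_wer = min(LOG, key=lambda row: row[3])   # lowest word error rate
best_loss = min(LOG, key=lambda row: row[2])  # lowest validation loss
final = LOG[-1]
steps_per_epoch = round(final[0] / final[1])

print("best Wer:", best_wer[3], "at step", best_wer[0])      # 0.3961 at step 2700
print("best loss:", best_loss[2], "at step", best_loss[0])   # 0.4266 at step 2600
print("final:", final[2], final[3], "at step", final[0])     # 0.4348 0.3988 at step 3000
print("steps per epoch:", steps_per_epoch)                   # 198
```

The reported evaluation results correspond to the final logged step (3000), not to the lowest Wer (step 2700) or the lowest validation loss (step 2600). The log also ends near epoch 15 although num_epochs is 30.0; the card does not state why.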

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 965M
  • Tensor type: F32
  • Format: Safetensors