Terjman-Nano-v2.1-512

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar; the training dataset is not specified. It achieves the following results on the evaluation set (these match the final logged checkpoint, step 35000):

  • Loss: 4.2875
  • Bleu: 2.1295
  • Gen Len: 10.2765
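
To put the reported loss on a more intuitive scale, it can be converted to perplexity. This is a minimal sketch that assumes the loss is the mean per-token cross-entropy in nats, as is usual for Trainer-generated cards; the card itself does not state this.

```python
import math

eval_loss = 4.2875  # "Loss" reported in the list above

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.1f}")
```

Under that assumption the perplexity comes out at roughly 73.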

Model description

More information needed

Intended uses & limitations

More information needed
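
Pending a fuller description, the following is a hedged sketch of how a checkpoint like this is typically loaded for inference. It assumes the `transformers` library is installed, that the repository ID is BounharAbdelaziz/Terjman-Nano-v2.1-512 (the name shown on the model page), and that the model keeps the English-to-Arabic direction of its base model. The helper names `load_translator` and `translate` are hypothetical and introduced only for illustration.

```python
from typing import Callable, List

# Assumption: repository ID as displayed on the model page.
MODEL_ID = "BounharAbdelaziz/Terjman-Nano-v2.1-512"


def load_translator() -> Callable[[List[str]], List[dict]]:
    """Build a translation pipeline; weights are downloaded on first use."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("translation", model=MODEL_ID)


def translate(texts: List[str], translator: Callable[[List[str]], List[dict]]) -> List[str]:
    """Run the translator and keep only the translated strings."""
    return [item["translation_text"] for item in translator(texts)]
```

A call such as `translate(["Good morning, how are you?"], load_translator())` would return a list with one translated string. Given the low BLEU score reported for this checkpoint, output quality should be checked before relying on it.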

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 10
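
The learning-rate settings above (linear scheduler, warmup ratio 0.03, peak 3e-05) can be sketched in plain Python. This is an illustration of a linear warmup followed by linear decay to zero, not the training code itself. The total of 35,660 steps is an assumption derived from roughly 3,566 steps per epoch over 10 epochs, and the function name `linear_lr` is hypothetical.

```python
import math

BASE_LR = 3e-05       # learning_rate from the list above
WARMUP_RATIO = 0.03   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 35_660  # assumption: ~3,566 steps per epoch x 10 epochs


def linear_lr(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Ramp up from 0 to BASE_LR over the warmup phase.
        return BASE_LR * step / max(1, warmup_steps)
    # Decay from BASE_LR to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


for step in (0, 535, 1_070, 18_365, 35_660):
    print(step, f"{linear_lr(step):.2e}")
```

With these assumed numbers the warmup lasts about 1,070 steps, so the peak rate is reached just after the first logged evaluation at step 1000.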

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu   | Gen Len |
|---------------|--------|-------|-----------------|--------|---------|
| 4.1858        | 0.2804 | 1000  | 4.9872          | 1.1697 | 9.3165  |
| 3.7672        | 0.5609 | 2000  | 4.5251          | 1.6917 | 11.1082 |
| 3.5203        | 0.8413 | 3000  | 4.4067          | 1.779  | 10.5894 |
| 3.6506        | 1.1217 | 4000  | 4.3579          | 1.8055 | 11.7212 |
| 3.4325        | 1.4021 | 5000  | 4.3266          | 1.8151 | 10.4882 |
| 3.4966        | 1.6826 | 6000  | 4.3114          | 1.8294 | 10.52   |
| 3.4795        | 1.9630 | 7000  | 4.3022          | 2.0241 | 10.5553 |
| 3.5567        | 2.2434 | 8000  | 4.2977          | 2.0571 | 10.3271 |
| 3.6008        | 2.5238 | 9000  | 4.2954          | 2.1029 | 10.2718 |
| 3.5513        | 2.8043 | 10000 | 4.2923          | 2.0792 | 10.3929 |
| 3.5116        | 3.0847 | 11000 | 4.2898          | 1.8706 | 10.3741 |
| 3.4962        | 3.3651 | 12000 | 4.2901          | 2.107  | 10.4306 |
| 3.5444        | 3.6455 | 13000 | 4.2911          | 2.0825 | 10.9212 |
| 3.4893        | 3.9260 | 14000 | 4.2871          | 2.1052 | 10.2388 |
| 3.3988        | 4.2064 | 15000 | 4.2871          | 2.1329 | 10.2576 |
| 3.4946        | 4.4868 | 16000 | 4.2873          | 2.1086 | 10.8788 |
| 3.4212        | 4.7672 | 17000 | 4.2871          | 2.0519 | 11.0012 |
| 3.4958        | 5.0477 | 18000 | 4.2865          | 2.0286 | 10.8812 |
| 3.3869        | 5.3281 | 19000 | 4.2876          | 2.046  | 10.4082 |
| 3.5321        | 5.6085 | 20000 | 4.2874          | 2.1578 | 10.4035 |
| 3.4374        | 5.8890 | 21000 | 4.2874          | 2.0745 | 10.9247 |
| 3.5439        | 6.1694 | 22000 | 4.2880          | 2.0663 | 10.3671 |
| 3.421         | 6.4498 | 23000 | 4.2870          | 2.1364 | 10.8282 |
| 3.547         | 6.7302 | 24000 | 4.2872          | 2.1323 | 10.8835 |
| 3.5297        | 7.0107 | 25000 | 4.2877          | 2.119  | 10.9729 |
| 3.3617        | 7.2911 | 26000 | 4.2880          | 2.1283 | 10.4388 |
| 3.511         | 7.5715 | 27000 | 4.2873          | 2.1401 | 10.2506 |
| 3.3947        | 7.8519 | 28000 | 4.2863          | 2.1352 | 10.7718 |
| 3.4888        | 8.1324 | 29000 | 4.2877          | 2.1507 | 10.8153 |
| 3.4712        | 8.4128 | 30000 | 4.2877          | 2.1401 | 10.1859 |
| 3.3557        | 8.6932 | 31000 | 4.2873          | 2.0575 | 11.2671 |
| 3.5038        | 8.9736 | 32000 | 4.2879          | 2.1183 | 10.4471 |
| 3.4788        | 9.2541 | 33000 | 4.2875          | 2.1325 | 11.4282 |
| 3.5303        | 9.5345 | 34000 | 4.2878          | 2.1102 | 10.3012 |
| 3.5182        | 9.8149 | 35000 | 4.2875          | 2.1295 | 10.2765 |
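
Because the training data is undocumented, the epoch and step values logged above are the only hint at its size. The sketch below derives the number of optimizer steps per epoch and a rough example count. It assumes one optimizer step per batch of 64 on a single device with no gradient accumulation, which the card does not confirm; the function name `steps_per_epoch` is hypothetical.

```python
TRAIN_BATCH_SIZE = 64  # train_batch_size from the hyperparameter list

# (epoch, step) pairs copied from the first and last rows of the training log.
LOG_POINTS = [(0.2804, 1000), (9.8149, 35000)]


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate optimizer steps in one epoch from a logged (epoch, step) pair."""
    return step / epoch


for epoch, step in LOG_POINTS:
    spe = steps_per_epoch(epoch, step)
    # Assumption: one step consumes exactly one batch of TRAIN_BATCH_SIZE examples.
    approx_examples = spe * TRAIN_BATCH_SIZE
    print(f"epoch {epoch}: ~{spe:.0f} steps/epoch, ~{approx_examples:,.0f} training examples")
```

Both log points give about 3,566 steps per epoch, which under the stated assumptions corresponds to roughly 228,000 training examples.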

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.21.0

Model size: 76.4M params (Safetensors, tensor type BF16)
Model repository: BounharAbdelaziz/Terjman-Nano-v2.1-512