Terjman-Nano-v2.1-512

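This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ar on an unknown dataset. At the last logged evaluation (step 35000, epoch 9.8149) it reaches a validation loss of 4.2875, a BLEU score of 2.1295, and an average generation length (Gen Len) of 10.2765.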

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.03
num_epochs: 10
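
To make the schedule settings above concrete, here is a minimal sketch of a linear warmup followed by linear decay, written in plain Python. It is not the training code. The value `STEPS_PER_EPOCH = 3566` is an assumption inferred from the step and epoch columns of the results table (step 1000 corresponds to epoch 0.2804), and rounding the warmup length up is also an assumption. The names `linear_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative only.

```python
import math

BASE_LR = 3e-05          # learning_rate from the list above
WARMUP_RATIO = 0.03      # lr_scheduler_warmup_ratio from the list above
NUM_EPOCHS = 10          # num_epochs from the list above
STEPS_PER_EPOCH = 3566   # ASSUMPTION: inferred from the results table, not stated in the card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
# ASSUMPTION: the warmup length is the ratio times the total step count, rounded up.
WARMUP_STEPS = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, 10000, 20000, 35000, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 3e-05 after roughly the first 3% of training and then falls steadily, so the late checkpoints are trained with a very small step size.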

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
4.1858	0.2804	1000	4.9872	1.1697	9.3165
3.7672	0.5609	2000	4.5251	1.6917	11.1082
3.5203	0.8413	3000	4.4067	1.779	10.5894
3.6506	1.1217	4000	4.3579	1.8055	11.7212
3.4325	1.4021	5000	4.3266	1.8151	10.4882
3.4966	1.6826	6000	4.3114	1.8294	10.52
3.4795	1.9630	7000	4.3022	2.0241	10.5553
3.5567	2.2434	8000	4.2977	2.0571	10.3271
3.6008	2.5238	9000	4.2954	2.1029	10.2718
3.5513	2.8043	10000	4.2923	2.0792	10.3929
3.5116	3.0847	11000	4.2898	1.8706	10.3741
3.4962	3.3651	12000	4.2901	2.107	10.4306
3.5444	3.6455	13000	4.2911	2.0825	10.9212
3.4893	3.9260	14000	4.2871	2.1052	10.2388
3.3988	4.2064	15000	4.2871	2.1329	10.2576
3.4946	4.4868	16000	4.2873	2.1086	10.8788
3.4212	4.7672	17000	4.2871	2.0519	11.0012
3.4958	5.0477	18000	4.2865	2.0286	10.8812
3.3869	5.3281	19000	4.2876	2.046	10.4082
3.5321	5.6085	20000	4.2874	2.1578	10.4035
3.4374	5.8890	21000	4.2874	2.0745	10.9247
3.5439	6.1694	22000	4.2880	2.0663	10.3671
3.421	6.4498	23000	4.2870	2.1364	10.8282
3.547	6.7302	24000	4.2872	2.1323	10.8835
3.5297	7.0107	25000	4.2877	2.119	10.9729
3.3617	7.2911	26000	4.2880	2.1283	10.4388
3.511	7.5715	27000	4.2873	2.1401	10.2506
3.3947	7.8519	28000	4.2863	2.1352	10.7718
3.4888	8.1324	29000	4.2877	2.1507	10.8153
3.4712	8.4128	30000	4.2877	2.1401	10.1859
3.3557	8.6932	31000	4.2873	2.0575	11.2671
3.5038	8.9736	32000	4.2879	2.1183	10.4471
3.4788	9.2541	33000	4.2875	2.1325	11.4282
3.5303	9.5345	34000	4.2878	2.1102	10.3012
3.5182	9.8149	35000	4.2875	2.1295	10.2765
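
The table above can also be read programmatically, for example to pick a checkpoint by BLEU or by validation loss. The sketch below parses a six-row excerpt copied from the table and is only one possible way to do it. The helper `load_rows` and the dictionary keys are hypothetical names introduced for illustration.

```python
import csv
import io

# Six rows copied verbatim from the table above (tab-separated, same column order).
EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tBleu\tGen Len\n"
    "4.1858\t0.2804\t1000\t4.9872\t1.1697\t9.3165\n"
    "3.4795\t1.9630\t7000\t4.3022\t2.0241\t10.5553\n"
    "3.4893\t3.9260\t14000\t4.2871\t2.1052\t10.2388\n"
    "3.5321\t5.6085\t20000\t4.2874\t2.1578\t10.4035\n"
    "3.3947\t7.8519\t28000\t4.2863\t2.1352\t10.7718\n"
    "3.5182\t9.8149\t35000\t4.2875\t2.1295\t10.2765\n"
)


def load_rows(text: str) -> list[dict]:
    """Parse the tab-separated table into dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "step": int(r["Step"]),
            "val_loss": float(r["Validation Loss"]),
            "bleu": float(r["Bleu"]),
        }
        for r in reader
    ]


rows = load_rows(EXCERPT)
best_bleu = max(rows, key=lambda r: r["bleu"])      # highest BLEU in the excerpt
best_loss = min(rows, key=lambda r: r["val_loss"])  # lowest validation loss in the excerpt

if __name__ == "__main__":
    print("best BLEU:", best_bleu)
    print("lowest validation loss:", best_loss)
```

The excerpt includes the rows with the highest BLEU (2.1578 at step 20000) and the lowest validation loss (4.2863 at step 28000) of the full table. From about step 14000 onward the validation loss changes only in the third and fourth decimal places, so differences between the later checkpoints may be within evaluation noise.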