Finetuned seamless-m4t-medium for Darija speech translation

This model is a fine-tuned version of facebook/seamless-m4t-medium on the Darija-C dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9487
  • BLEU: 0.6520
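
The snippet below is a minimal inference sketch using the standard transformers SeamlessM4T speech-to-text API. The checkpoint id is the one this card is published under; the audio file name and the target language code ("eng") are illustrative assumptions, since the card does not specify the translation direction.

```python
import torchaudio
from transformers import AutoProcessor, SeamlessM4TForSpeechToText

# Repository id as listed on this card.
checkpoint = "Marialab/finetuned-seamless-m4T-medium-1000-step"
processor = AutoProcessor.from_pretrained(checkpoint)
model = SeamlessM4TForSpeechToText.from_pretrained(checkpoint)

# SeamlessM4T expects 16 kHz mono audio; resample if needed.
waveform, sample_rate = torchaudio.load("darija_clip.wav")  # hypothetical input file
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(audios=waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")

# Target language is an assumption ("eng" = English); use the code your task needs.
generated = model.generate(**inputs, tgt_lang="eng")
print(processor.decode(generated[0], skip_special_tokens=True))
```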

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch mirroring them follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1000
  • mixed_precision_training: Native AMP
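
As a hedged illustration, the hyperparameters above map onto transformers training arguments roughly as follows; the output directory is a placeholder, and the use of Seq2SeqTrainingArguments is an assumption, since the card does not show the training script.

```python
from transformers import Seq2SeqTrainingArguments

# Values taken from the list above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="finetuned-seamless-m4t-medium",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=1000,
    fp16=True,  # "Native AMP" mixed-precision training
)
```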

Training results

| Training Loss | Epoch | Step | Validation Loss | BLEU   |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 11.4385       | 12.5  | 50   | 7.7186          | 0.0    |
| 7.0056        | 25.0  | 100  | 4.6583          | 0.0    |
| 4.7721        | 37.5  | 150  | 3.4895          | 0.0402 |
| 3.9618        | 50.0  | 200  | 2.7834          | 0.0324 |
| 3.2307        | 62.5  | 250  | 2.3169          | 0.0601 |
| 2.9341        | 75.0  | 300  | 2.1102          | 0.0969 |
| 2.5517        | 87.5  | 350  | 1.9636          | 0.1041 |
| 2.3681        | 100.0 | 400  | 1.8770          | 0.0888 |
| 2.1031        | 112.5 | 450  | 1.7589          | 0.1302 |
| 2.1191        | 125.0 | 500  | 1.6578          | 0.1765 |
| 1.9185        | 137.5 | 550  | 1.5659          | 0.1802 |
| 1.9021        | 150.0 | 600  | 1.4514          | 0.4482 |
| 2.0155        | 162.5 | 650  | 1.3543          | 0.3924 |
| 1.8151        | 175.0 | 700  | 1.3195          | 0.3651 |
| 1.7461        | 187.5 | 750  | 1.1878          | 0.4723 |
| 1.6512        | 200.0 | 800  | 1.1107          | 0.5124 |
| 1.6378        | 212.5 | 850  | 1.0550          | 0.5922 |
| 1.4851        | 225.0 | 900  | 0.9948          | 0.6548 |
| 1.5016        | 237.5 | 950  | 0.9596          | 0.6390 |
| 1.4868        | 250.0 | 1000 | 0.9487          | 0.6520 |
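
The BLEU values above are on a 0-1 scale. A minimal sketch of how such scores can be computed with the evaluate library is shown below; the card does not state which BLEU implementation was used, so this is an assumption for illustration, and the predictions and references are hypothetical.

```python
import evaluate

bleu = evaluate.load("bleu")  # returns scores on a 0-1 scale

# Hypothetical decoded outputs and gold translations, purely for illustration.
predictions = ["the weather is nice today"]
references = [["the weather is nice today"]]

result = bleu.compute(predictions=predictions, references=references)
print(result["bleu"])
```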

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0