# wav2vec2-large-xls-r-300m-Mezge

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.3245
- Wer: 0.1777
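The reported Wer is the word error rate: word-level edit distance divided by the number of reference words, so 0.1777 means roughly one word in six is substituted, inserted, or deleted. A minimal illustrative implementation (the example sentences are made up, not from the evaluation set):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# 1 substitution ("sat" -> "sit") + 1 deletion ("the"): 2 edits / 6 words
print(wer("the cat sat on the mat", "the cat sit on mat"))
```

In practice the `jiwer` library is commonly used for this metric during Hugging Face ASR evaluation.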

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 30
- mixed_precision_training: Native AMP
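With a linear scheduler and 500 warmup steps, the learning rate ramps from 0 to the 0.0003 peak over the first 500 optimizer steps, then decays linearly to 0 at the end of training. A pure-Python sketch of that schedule; the total step count (~6240 for 30 epochs at ~208 optimizer steps per epoch) is an estimate inferred from the training log below, not stated in the card:

```python
PEAK_LR = 3e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 6240  # assumption: ~208 optimizer steps/epoch * 30 epochs

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # Linear decay from peak to zero over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)

print(lr_at(0), lr_at(500), lr_at(6240))
```

This mirrors what the Hugging Face `Trainer` does with `lr_scheduler_type: linear` via `get_linear_schedule_with_warmup`.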

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 29.8978       | 1.9242  | 400  | 3.4282          | 1.0    |
| 8.319         | 3.8472  | 800  | 0.5659          | 0.5170 |
| 1.6729        | 5.7702  | 1200 | 0.3004          | 0.2911 |
| 0.8499        | 7.6931  | 1600 | 0.2747          | 0.2542 |
| 0.5621        | 9.6161  | 2000 | 0.2893          | 0.2320 |
| 0.4271        | 11.5391 | 2400 | 0.2720          | 0.2185 |
| 0.3494        | 13.4621 | 2800 | 0.2883          | 0.2143 |
| 0.2881        | 15.3851 | 3200 | 0.3053          | 0.2050 |
| 0.2588        | 17.3081 | 3600 | 0.3074          | 0.1977 |
| 0.2287        | 19.2310 | 4000 | 0.3137          | 0.1924 |
| 0.1936        | 21.1540 | 4400 | 0.3121          | 0.1907 |
| 0.1728        | 23.0770 | 4800 | 0.3246          | 0.1847 |
| 0.1567        | 25.0    | 5200 | 0.3268          | 0.1855 |
| 0.1359        | 26.9242 | 5600 | 0.3212          | 0.1798 |
| 0.1199        | 28.8472 | 6000 | 0.3245          | 0.1777 |
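The log also lets us back out an approximate training-set size, since each logged step count pairs with a fractional epoch. This is an inference from the table, not a figure stated in the card:

```python
# Sanity check on the training log: steps per epoch and implied
# dataset size (estimates derived from the table above).
total_train_batch_size = 8 * 4            # per-device batch x grad accumulation
steps_per_epoch = 400 / 1.9242            # first logged row: step 400 at epoch 1.9242
approx_train_examples = steps_per_epoch * total_train_batch_size

print(round(steps_per_epoch, 1), round(approx_train_examples))  # ~207.9 steps/epoch, ~6652 examples
```

So the training split likely contains on the order of 6,600 utterances, consistent with 6000 logged steps landing near epoch 28.85.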

### Framework versions

- Transformers 4.47.1
- Pytorch 2.2.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0