mms-1b-all-lg-GRAIN-v1

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified in this card. On the evaluation set, the final logged checkpoint (epoch 96 in the training results) achieves the following results:

Loss: 0.0539
Wer: 0.0650
Cer: 0.0121
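
The card does not say how Wer and Cer were computed. As a minimal sketch, assuming the standard definitions (word-level and character-level edit distance divided by the reference length), the two metrics can be illustrated as follows. The function names and the example sentences are hypothetical and are not taken from the evaluation data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    reference = "the cat sat on the mat"   # made-up example, not evaluation data
    hypothesis = "the cat sit on mat"
    print(f"WER = {wer(reference, hypothesis):.4f}")
    print(f"CER = {cer(reference, hypothesis):.4f}")
```

Under these definitions, a Wer of 0.0650 corresponds to roughly 6.5 word errors per 100 reference words, and a Cer of 0.0121 to roughly 1.2 character errors per 100 reference characters.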

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 100
mixed_precision_training: Native AMP
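
To make the relationship between these values concrete, here is a hedged sketch in plain Python. The dictionary keys follow the field names of `transformers.TrainingArguments`, which is an assumption about how the run was configured. The single-device setting, the `fp16` flag, and the use of the planned 100 epochs to size the schedule are also assumptions. The helper functions are hypothetical. The steps-per-epoch figure is taken from the training results table.

```python
# Values copied from the hyperparameter list; key names mirror
# transformers.TrainingArguments fields (an assumption, not confirmed by the card).
training_kwargs = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 100,
    "fp16": True,  # assumption: "Native AMP" could also refer to bf16
}

NUM_DEVICES = 1        # assumption: 8 * 2 = 16 only matches the card for one device
STEPS_PER_EPOCH = 1385  # step count after epoch 1 in the training results table


def total_train_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Assumption: the schedule was sized for all 100 planned epochs.
total_steps = STEPS_PER_EPOCH * training_kwargs["num_train_epochs"]

if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size(training_kwargs))
    for step in (0, 50, 100, total_steps // 2, total_steps):
        lr = linear_lr(step, training_kwargs["learning_rate"],
                       training_kwargs["warmup_steps"], total_steps)
        print(f"step {step:>6}: lr = {lr:.6f}")
```

With a warmup of only 100 steps out of a six-figure step count, the learning rate reaches its peak of 0.001 almost immediately and then decays for nearly the whole run.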

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
0.6472	1.0	1385	0.1179	0.1626	0.0285
0.3081	2.0	2770	0.1075	0.1525	0.0264
0.2974	3.0	4155	0.1057	0.1517	0.0263
0.2874	4.0	5540	0.0980	0.1374	0.0243
0.2816	5.0	6925	0.0988	0.1357	0.0237
0.2749	6.0	8310	0.0925	0.1258	0.0228
0.27	7.0	9695	0.0896	0.1224	0.0218
0.2637	8.0	11080	0.0856	0.1142	0.0207
0.26	9.0	12465	0.0849	0.1200	0.0218
0.2564	10.0	13850	0.0838	0.1079	0.0199
0.2524	11.0	15235	0.0806	0.1100	0.0194
0.2497	12.0	16620	0.0784	0.1115	0.0198
0.2463	13.0	18005	0.0774	0.1069	0.0193
0.2462	14.0	19390	0.0813	0.1083	0.0196
0.2406	15.0	20775	0.0771	0.1021	0.0184
0.2369	16.0	22160	0.0772	0.1017	0.0190
0.235	17.0	23545	0.0740	0.0939	0.0179
0.2313	18.0	24930	0.0735	0.0988	0.0178
0.2297	19.0	26315	0.0743	0.1028	0.0184
0.2265	20.0	27700	0.0724	0.0997	0.0178
0.2229	21.0	29085	0.0728	0.0959	0.0175
0.2205	22.0	30470	0.0709	0.0930	0.0171
0.2194	23.0	31855	0.0677	0.0903	0.0166
0.2159	24.0	33240	0.0681	0.0903	0.0163
0.2155	25.0	34625	0.0694	0.0918	0.0170
0.2133	26.0	36010	0.0679	0.0930	0.0171
0.2103	27.0	37395	0.0713	0.0926	0.0167
0.2076	28.0	38780	0.0665	0.0918	0.0164
0.2067	29.0	40165	0.0679	0.0853	0.0160
0.2068	30.0	41550	0.0640	0.0835	0.0155
0.2035	31.0	42935	0.0644	0.0831	0.0157
0.2015	32.0	44320	0.0646	0.0893	0.0162
0.1993	33.0	45705	0.0656	0.0883	0.0159
0.1976	34.0	47090	0.0637	0.0825	0.0151
0.1957	35.0	48475	0.0620	0.0827	0.0152
0.1944	36.0	49860	0.0616	0.0812	0.0153
0.1929	37.0	51245	0.0604	0.0847	0.0153
0.1899	38.0	52630	0.0624	0.0858	0.0153
0.1897	39.0	54015	0.0621	0.0872	0.0156
0.1888	40.0	55400	0.0609	0.0808	0.0154
0.1872	41.0	56785	0.0627	0.0841	0.0151
0.1845	42.0	58170	0.0602	0.0849	0.0151
0.1857	43.0	59555	0.0629	0.0866	0.0157
0.183	44.0	60940	0.0586	0.0775	0.0143
0.1821	45.0	62325	0.0604	0.0856	0.0152
0.1806	46.0	63710	0.0614	0.0835	0.0148
0.1788	47.0	65095	0.0592	0.0818	0.0146
0.1776	48.0	66480	0.0590	0.0804	0.0147
0.1765	49.0	67865	0.0596	0.0796	0.0148
0.175	50.0	69250	0.0571	0.0787	0.0145
0.1733	51.0	70635	0.0586	0.0806	0.0148
0.1738	52.0	72020	0.0577	0.0789	0.0141
0.1702	53.0	73405	0.0579	0.0764	0.0146
0.1702	54.0	74790	0.0592	0.0766	0.0142
0.1689	55.0	76175	0.0555	0.0742	0.0138
0.1671	56.0	77560	0.0572	0.0773	0.0141
0.1658	57.0	78945	0.0557	0.0766	0.0140
0.1659	58.0	80330	0.0550	0.0769	0.0142
0.1643	59.0	81715	0.0539	0.0735	0.0131
0.1641	60.0	83100	0.0537	0.0754	0.0135
0.1623	61.0	84485	0.0543	0.0746	0.0133
0.1608	62.0	85870	0.0537	0.0698	0.0131
0.1587	63.0	87255	0.0567	0.0756	0.0132
0.1594	64.0	88640	0.0571	0.0740	0.0131
0.1576	65.0	90025	0.0557	0.0723	0.0135
0.1579	66.0	91410	0.0564	0.0760	0.0138
0.1575	67.0	92795	0.0579	0.0729	0.0135
0.1555	68.0	94180	0.0571	0.0742	0.0136
0.1547	69.0	95565	0.0557	0.0680	0.0127
0.1537	70.0	96950	0.0567	0.0708	0.0132
0.1516	71.0	98335	0.0569	0.0713	0.0131
0.1505	72.0	99720	0.0573	0.0723	0.0133
0.1491	73.0	101105	0.0569	0.0688	0.0125
0.1495	74.0	102490	0.0579	0.0760	0.0134
0.1491	75.0	103875	0.0560	0.0750	0.0132
0.1479	76.0	105260	0.0551	0.0684	0.0126
0.1464	77.0	106645	0.0539	0.0696	0.0124
0.145	78.0	108030	0.0555	0.0702	0.0127
0.1456	79.0	109415	0.0539	0.0679	0.0122
0.1438	80.0	110800	0.0547	0.0677	0.0123
0.1437	81.0	112185	0.0524	0.0661	0.0120
0.1418	82.0	113570	0.0531	0.0679	0.0125
0.1419	83.0	114955	0.0548	0.0688	0.0125
0.1419	84.0	116340	0.0529	0.0646	0.0121
0.1396	85.0	117725	0.0526	0.0686	0.0123
0.1393	86.0	119110	0.0527	0.0684	0.0122
0.1385	87.0	120495	0.0543	0.0682	0.0122
0.1368	88.0	121880	0.0536	0.0665	0.0120
0.1359	89.0	123265	0.0550	0.0667	0.0120
0.1365	90.0	124650	0.0515	0.0646	0.0117
0.1359	91.0	126035	0.0541	0.0669	0.0123
0.1348	92.0	127420	0.0521	0.0659	0.0118
0.1357	93.0	128805	0.0537	0.0669	0.0122
0.1337	94.0	130190	0.0542	0.0696	0.0126
0.1327	95.0	131575	0.0546	0.0638	0.0118
0.133	96.0	132960	0.0539	0.0650	0.0121
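
The table ends at epoch 96, and its last row matches the headline metrics reported at the top of the card. A short sketch for reading the table follows: it picks the best epoch per metric and the relative improvement since epoch 1. Only a handful of rows are copied in to keep the example short, and the helper names are hypothetical.

```python
# (epoch, validation_loss, wer, cer), copied from selected rows of the table above.
rows = [
    (1, 0.1179, 0.1626, 0.0285),
    (50, 0.0571, 0.0787, 0.0145),
    (84, 0.0529, 0.0646, 0.0121),
    (90, 0.0515, 0.0646, 0.0117),
    (95, 0.0546, 0.0638, 0.0118),
    (96, 0.0539, 0.0650, 0.0121),
]

VAL_LOSS, WER, CER = 1, 2, 3  # column indices into each row


def best_epoch(rows, column):
    """Epoch with the lowest value in the given column (first one wins on ties)."""
    return min(rows, key=lambda row: row[column])[0]


def relative_reduction(start, end):
    """Fractional decrease from start to end."""
    return (start - end) / start


if __name__ == "__main__":
    for name, column in (("validation loss", VAL_LOSS), ("WER", WER), ("CER", CER)):
        print(f"lowest {name}: epoch {best_epoch(rows, column)}")
    first, last = rows[0], rows[-1]
    print(f"relative WER reduction, epoch 1 to 96: "
          f"{relative_reduction(first[WER], last[WER]):.1%}")
    print(f"relative CER reduction, epoch 1 to 96: "
          f"{relative_reduction(first[CER], last[CER]):.1%}")
```

In the full table the lowest validation loss (0.0515) and Cer (0.0117) occur at epoch 90 and the lowest Wer (0.0638) at epoch 95, so the final checkpoint is close to, but not exactly, the best logged one on any metric.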

Framework versions

Transformers 4.47.0
PyTorch 2.1.0+cu118
Datasets 3.1.0
Tokenizers 0.21.0

Repository: sulaimank/mms-1b-all-lg-GRAIN-v1