
kiranpantha/whisper-large-v3-nepali-dora-qkv

This model is a PEFT adapter for kiranpantha/whisper-large-v3-nepali, fine-tuned on the OpenSLR54 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4213
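
Because this repository holds a PEFT adapter rather than a full model, it is loaded on top of the base checkpoint named above. A minimal usage sketch, assuming the base repository also provides the processor files; the dtype choice is illustrative, not stated on this card:

```python
import torch
from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor

base_id = "kiranpantha/whisper-large-v3-nepali"              # base model named above
adapter_id = "kiranpantha/whisper-large-v3-nepali-dora-qkv"  # this adapter

processor = WhisperProcessor.from_pretrained(base_id)
model = WhisperForConditionalGeneration.from_pretrained(
    base_id, torch_dtype=torch.float16  # assumed dtype, not from the card
)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter weights
model.eval()
```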

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch reconstructing them as training arguments follows the list:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 100
  • mixed_precision_training: Native AMP
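
A hedged reconstruction of these settings as Seq2SeqTrainingArguments; the output_dir value and the fp16 flag are assumptions (the latter inferred from the "Native AMP" line), not values stated on the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-nepali-dora-qkv",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # AdamW, betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=100,
    fp16=True,                    # assumed realization of "Native AMP"
)
```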

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| No log        | 1.0   | 14   | 0.3574          |
| 0.6204        | 2.0   | 28   | 0.3134          |
| 0.6204        | 3.0   | 42   | 0.3488          |
| 0.1212        | 4.0   | 56   | 0.3598          |
| 0.1212        | 5.0   | 70   | 0.4100          |
| 0.0805        | 6.0   | 84   | 0.3659          |
| 0.0805        | 7.0   | 98   | 0.3481          |
| 0.063         | 8.0   | 112  | 0.3967          |
| 0.0304        | 9.0   | 126  | 0.3466          |
| 0.0304        | 10.0  | 140  | 0.3495          |
| 0.0287        | 11.0  | 154  | 0.3672          |
| 0.0287        | 12.0  | 168  | 0.3762          |
| 0.0173        | 13.0  | 182  | 0.3518          |
| 0.0173        | 14.0  | 196  | 0.3574          |
| 0.0205        | 15.0  | 210  | 0.3874          |
| 0.0205        | 16.0  | 224  | 0.3449          |
| 0.0127        | 17.0  | 238  | 0.3697          |
| 0.0138        | 18.0  | 252  | 0.3428          |
| 0.0138        | 19.0  | 266  | 0.3290          |
| 0.0096        | 20.0  | 280  | 0.3459          |
| 0.0096        | 21.0  | 294  | 0.3702          |
| 0.0105        | 22.0  | 308  | 0.3467          |
| 0.0105        | 23.0  | 322  | 0.3903          |
| 0.0046        | 24.0  | 336  | 0.3670          |
| 0.0019        | 25.0  | 350  | 0.3859          |
| 0.0019        | 26.0  | 364  | 0.3900          |
| 0.0006        | 27.0  | 378  | 0.3857          |
| 0.0006        | 28.0  | 392  | 0.3836          |
| 0.0004        | 29.0  | 406  | 0.3851          |
| 0.0004        | 30.0  | 420  | 0.3864          |
| 0.0003        | 31.0  | 434  | 0.3878          |
| 0.0003        | 32.0  | 448  | 0.3899          |
| 0.0003        | 33.0  | 462  | 0.3904          |
| 0.0003        | 34.0  | 476  | 0.3917          |
| 0.0003        | 35.0  | 490  | 0.3925          |
| 0.0002        | 36.0  | 504  | 0.3939          |
| 0.0002        | 37.0  | 518  | 0.3946          |
| 0.0002        | 38.0  | 532  | 0.3957          |
| 0.0002        | 39.0  | 546  | 0.3958          |
| 0.0002        | 40.0  | 560  | 0.3966          |
| 0.0002        | 41.0  | 574  | 0.3977          |
| 0.0002        | 42.0  | 588  | 0.3984          |
| 0.0002        | 43.0  | 602  | 0.3990          |
| 0.0002        | 44.0  | 616  | 0.3998          |
| 0.0002        | 45.0  | 630  | 0.4008          |
| 0.0002        | 46.0  | 644  | 0.4013          |
| 0.0002        | 47.0  | 658  | 0.4016          |
| 0.0002        | 48.0  | 672  | 0.4023          |
| 0.0002        | 49.0  | 686  | 0.4028          |
| 0.0002        | 50.0  | 700  | 0.4034          |
| 0.0002        | 51.0  | 714  | 0.4041          |
| 0.0001        | 52.0  | 728  | 0.4048          |
| 0.0001        | 53.0  | 742  | 0.4054          |
| 0.0001        | 54.0  | 756  | 0.4060          |
| 0.0001        | 55.0  | 770  | 0.4064          |
| 0.0001        | 56.0  | 784  | 0.4072          |
| 0.0001        | 57.0  | 798  | 0.4077          |
| 0.0001        | 58.0  | 812  | 0.4081          |
| 0.0001        | 59.0  | 826  | 0.4088          |
| 0.0001        | 60.0  | 840  | 0.4097          |
| 0.0001        | 61.0  | 854  | 0.4104          |
| 0.0001        | 62.0  | 868  | 0.4107          |
| 0.0001        | 63.0  | 882  | 0.4114          |
| 0.0001        | 64.0  | 896  | 0.4122          |
| 0.0001        | 65.0  | 910  | 0.4122          |
| 0.0001        | 66.0  | 924  | 0.4128          |
| 0.0001        | 67.0  | 938  | 0.4136          |
| 0.0001        | 68.0  | 952  | 0.4138          |
| 0.0001        | 69.0  | 966  | 0.4141          |
| 0.0001        | 70.0  | 980  | 0.4144          |
| 0.0001        | 71.0  | 994  | 0.4150          |
| 0.0001        | 72.0  | 1008 | 0.4155          |
| 0.0001        | 73.0  | 1022 | 0.4156          |
| 0.0001        | 74.0  | 1036 | 0.4161          |
| 0.0001        | 75.0  | 1050 | 0.4163          |
| 0.0001        | 76.0  | 1064 | 0.4167          |
| 0.0001        | 77.0  | 1078 | 0.4171          |
| 0.0001        | 78.0  | 1092 | 0.4174          |
| 0.0001        | 79.0  | 1106 | 0.4176          |
| 0.0001        | 80.0  | 1120 | 0.4177          |
| 0.0001        | 81.0  | 1134 | 0.4179          |
| 0.0001        | 82.0  | 1148 | 0.4183          |
| 0.0001        | 83.0  | 1162 | 0.4186          |
| 0.0001        | 84.0  | 1176 | 0.4187          |
| 0.0001        | 85.0  | 1190 | 0.4192          |
| 0.0001        | 86.0  | 1204 | 0.4194          |
| 0.0001        | 87.0  | 1218 | 0.4197          |
| 0.0001        | 88.0  | 1232 | 0.4198          |
| 0.0001        | 89.0  | 1246 | 0.4201          |
| 0.0001        | 90.0  | 1260 | 0.4204          |
| 0.0001        | 91.0  | 1274 | 0.4205          |
| 0.0001        | 92.0  | 1288 | 0.4206          |
| 0.0001        | 93.0  | 1302 | 0.4208          |
| 0.0001        | 94.0  | 1316 | 0.4208          |
| 0.0001        | 95.0  | 1330 | 0.4209          |
| 0.0001        | 96.0  | 1344 | 0.4211          |
| 0.0001        | 97.0  | 1358 | 0.4211          |
| 0.0001        | 98.0  | 1372 | 0.4211          |
| 0.0001        | 99.0  | 1386 | 0.4212          |
| 0.0001        | 100.0 | 1400 | 0.4213          |
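
Note that the validation loss bottoms out at 0.3290 (epoch 19, step 266) and climbs steadily afterward while the training loss approaches zero, so the final checkpoint reported above is not the best one by validation loss. A minimal sketch of how the Trainer could retain the best checkpoint instead, assuming per-epoch evaluation as in the table; the patience value and output_dir are arbitrary assumptions:

```python
from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-nepali-dora-qkv",  # assumed name
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # reload the lowest-eval-loss checkpoint at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Passed to the trainer along with an optional early-stopping callback, e.g.:
# trainer = Seq2SeqTrainer(..., args=training_args,
#                          callbacks=[EarlyStoppingCallback(early_stopping_patience=5)])
```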

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.1
  • PyTorch 2.5.1+cxx11.abi
  • Datasets 3.2.0
  • Tokenizers 0.21.0
