_nougat_JawiChar_Jawi

This model is a fine-tuned version of bustamiyusoef/_base_nougat_JawiChar on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.9076

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 6
total_train_batch_size: 48
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 30
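
As a rough check on how these values fit together, the sketch below recomputes the total train batch size and the learning rate implied by a linear schedule. It rests on assumptions the card does not state: a single training device, zero warmup steps, and a schedule planned for the full 30 epochs at the 9 optimizer steps per epoch shown in the results table. The helper name linear_lr is hypothetical and used only for illustration.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 6
num_epochs = 30

# Read off the results table: 9 optimizer steps per epoch.
steps_per_epoch = 9

# Assumption: a single training device, so no extra multiplier.
num_devices = 1

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
total_steps = steps_per_epoch * num_epochs


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear warmup followed by linear decay to zero (warmup=0 is an assumption)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_train_batch_size)  # 48
print(total_steps)             # 270
print(linear_lr(234))          # learning rate at the last logged step
```

Under these assumptions the learning rate at step 234, the last logged step, would still be above zero, because the schedule would not reach zero until step 270.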

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	9	2.7102
25.839	2.0	18	2.4453
14.4532	3.0	27	2.3152
13.3167	4.0	36	2.2394
12.7721	5.0	45	2.1704
12.3282	6.0	54	2.1213
11.95	7.0	63	2.0908
11.7647	8.0	72	2.0699
11.5322	9.0	81	2.0403
10.5067	10.0	90	2.0145
10.5067	11.0	99	1.9972
11.144	12.0	108	1.9812
10.9558	13.0	117	1.9762
10.8235	14.0	126	1.9503
10.6902	15.0	135	1.9419
10.4891	16.0	144	1.9322
10.5661	17.0	153	1.9327
10.353	18.0	162	1.9243
10.3232	19.0	171	1.9193
9.5265	20.0	180	1.9114
9.5265	21.0	189	1.9138
10.2046	22.0	198	1.9108
10.1763	23.0	207	1.9073
10.1397	24.0	216	1.9089
10.1327	25.0	225	1.9088
10.1044	26.0	234	1.9076
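
To make the table easier to read, the sketch below finds the epoch with the lowest validation loss and the epochs where validation loss rose compared with the previous epoch. The numbers are copied from the table above; the variable names are illustrative only.

```python
# Validation loss per epoch, copied from the results table (epochs 1 to 26).
val_losses = [
    2.7102, 2.4453, 2.3152, 2.2394, 2.1704, 2.1213, 2.0908, 2.0699, 2.0403,
    2.0145, 1.9972, 1.9812, 1.9762, 1.9503, 1.9419, 1.9322, 1.9327, 1.9243,
    1.9193, 1.9114, 1.9138, 1.9108, 1.9073, 1.9089, 1.9088, 1.9076,
]

# Epoch numbers are 1-based, list indices are 0-based.
best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
best_epoch = best_index + 1
best_loss = val_losses[best_index]

# Epochs whose validation loss is higher than the previous epoch's.
regressions = [
    epoch
    for epoch in range(2, len(val_losses) + 1)
    if val_losses[epoch - 1] > val_losses[epoch - 2]
]

final_loss = val_losses[-1]

print(best_epoch, best_loss)  # 23 1.9073
print(regressions)            # [17, 21, 24]
print(final_loss)             # 1.9076
```

The loss reported at the top of the card (1.9076) matches the last logged epoch, not the minimum in the table (1.9073 at epoch 23).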

Framework versions

Transformers 4.47.1
Pytorch 2.5.1+cu121
Datasets 3.2.0
Tokenizers 0.21.0
