paligemma_clevr

This model is a fine-tuned version of leo009/paligemma-3b-pt-224 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4210

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 3
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 12
  • optimizer: adamw_hf with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 2
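The effective batch size and learning-rate schedule follow directly from the hyperparameters above. A minimal sketch in Python, assuming the standard linear-with-warmup schedule that `lr_scheduler_type: linear` denotes; the `total_steps` value of ~1166 is inferred from the training log below, not stated in the card:

```python
# Effective batch size: per-device batch * gradient accumulation steps.
train_batch_size = 3
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 12

def linear_lr(step, base_lr=2e-5, warmup_steps=2, total_steps=1166):
    """Linear schedule with warmup (sketch of lr_scheduler_type='linear')."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: ramp linearly from base_lr at warmup down to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

With only 2 warmup steps, the schedule is effectively a pure linear decay from 2e-05 over the run.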

Training results

Training Loss   Epoch    Step   Validation Loss
1.463           0.1715    100   0.5191
0.5064          0.3429    200   0.4908
0.4851          0.5144    300   0.4699
0.4797          0.6858    400   0.4539
0.4608          0.8573    500   0.4486
0.4416          1.0274    600   0.4425
0.4253          1.1989    700   0.4316
0.4015          1.3703    800   0.4245
0.3976          1.5418    900   0.4262
0.4092          1.7132   1000   0.4193
0.399           1.8847   1100   0.4210
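The Epoch/Step columns above also imply the scale of the training run. A small back-of-the-envelope calculation, using the first table row and the hyperparameters section; the resulting dataset size is an estimate, not a figure stated in this card:

```python
# From the first row: 100 optimizer steps correspond to 0.1715 epochs.
steps_per_epoch = 100 / 0.1715   # ~583 optimizer steps per epoch
total_train_batch_size = 12      # from the hyperparameters section
# Approximate number of training samples seen per epoch.
approx_num_samples = steps_per_epoch * total_train_batch_size  # ~7000 samples
```

This is consistent with the table ending around step 1100 near epoch 1.88 of the 2-epoch run.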

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.0
  • Pytorch 2.4.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model tree for SumitAST/paligemma_clevr

This model is a PEFT adapter of leo009/paligemma-3b-pt-224.