sst5-t5-base-kd / README.md
---
language: en
license: mit
library_name: pytorch
---

Plainly Optimized Network

Dataset: BIGBENCH

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 1
  • gradient_accumulation_steps = 4
  • weight_decay = 1e-09
  • seed = 42
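
As a rough sketch of how these settings combine, the snippet below stores the listed values in a plain dataclass and computes the effective batch size per optimizer step. The class name `TrainerHyperparameters`, the method `effective_batch_size`, and the `num_devices` parameter are hypothetical names introduced for illustration. The number of devices is not stated in this card, so a single device is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerHyperparameters:
    """Hypothetical container for the hyperparameters listed in this card."""

    lr: float = 5e-05
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 4
    weight_decay: float = 1e-09
    seed: int = 42

    def effective_batch_size(self, num_devices: int = 1) -> int:
        # num_devices is an assumption: the card does not say how many
        # devices were used, so a single device is the default here.
        return (
            self.per_device_batch_size
            * self.gradient_accumulation_steps
            * num_devices
        )


hparams = TrainerHyperparameters()
effective = hparams.effective_batch_size()
print(f"effective batch size per optimizer step: {effective}")
```

Under the single-device assumption, gradients from four examples are accumulated before each weight update.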

| eval_loss | eval_accuracy | epoch |
|---|---|---|
| 58.940 | 0.054 | 1.0 |
| 54.182 | 0.049 | 2.0 |
| 56.362 | 0.051 | 3.0 |
| 52.705 | 0.046 | 4.0 |
| 55.357 | 0.050 | 5.0 |
| 53.973 | 0.048 | 6.0 |
| 56.034 | 0.050 | 7.0 |
| 51.731 | 0.045 | 8.0 |
| 54.661 | 0.048 | 9.0 |
| 50.378 | 0.043 | 10.0 |
| 51.579 | 0.044 | 11.0 |
| 51.193 | 0.044 | 12.0 |
| 52.724 | 0.046 | 13.0 |
| 52.055 | 0.045 | 14.0 |
| 51.406 | 0.044 | 15.0 |
| 51.539 | 0.045 | 16.0 |
| 52.422 | 0.046 | 17.0 |
| 50.304 | 0.043 | 18.0 |
| 50.937 | 0.044 | 19.0 |
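
To read the evaluation log more easily, the sketch below copies the rows reported above into a list and looks up the epoch with the lowest `eval_loss` and the epoch with the highest `eval_accuracy`. The variable names are hypothetical, and this card does not say which checkpoint was actually uploaded, so the snippet only summarizes the numbers.

```python
# Rows copied from the evaluation log: (eval_loss, eval_accuracy, epoch).
rows = [
    (58.940, 0.054, 1.0), (54.182, 0.049, 2.0), (56.362, 0.051, 3.0),
    (52.705, 0.046, 4.0), (55.357, 0.050, 5.0), (53.973, 0.048, 6.0),
    (56.034, 0.050, 7.0), (51.731, 0.045, 8.0), (54.661, 0.048, 9.0),
    (50.378, 0.043, 10.0), (51.579, 0.044, 11.0), (51.193, 0.044, 12.0),
    (52.724, 0.046, 13.0), (52.055, 0.045, 14.0), (51.406, 0.044, 15.0),
    (51.539, 0.045, 16.0), (52.422, 0.046, 17.0), (50.304, 0.043, 18.0),
    (50.937, 0.044, 19.0),
]

# Epoch with the lowest evaluation loss.
best_by_loss = min(rows, key=lambda row: row[0])
# Epoch with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda row: row[1])

print(f"lowest eval_loss: {best_by_loss[0]} at epoch {best_by_loss[2]}")
print(f"highest eval_accuracy: {best_by_accuracy[1]} at epoch {best_by_accuracy[2]}")
```

In these numbers the two criteria select different epochs: the lowest loss is at epoch 18, while the highest accuracy is at epoch 1.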