Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
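The names above match keyword arguments of `Seq2SeqTrainingArguments` in Hugging Face Transformers (`predict_with_generate` is specific to that class). The sketch below shows how they might be passed, assuming that class was used for this run. `output_dir="outputs"` and the helper name `build_training_args` are hypothetical and not taken from this card.

```python
# Hyperparameters exactly as listed in this card.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,
}


def build_training_args(output_dir="outputs"):
    """Build training arguments from HYPERPARAMS.

    Assumption: the run used transformers.Seq2SeqTrainingArguments.
    `output_dir` is a placeholder; the original path is not given in this card.
    """
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Because `fp16=True` enables mixed-precision training, calling `build_training_args()` generally requires a supported GPU.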
Training Output
global_step=3003,
training_loss=1.8524150695953217,
metrics={'train_runtime': 2319.7329,
'train_samples_per_second': 18.122,
'train_steps_per_second': 1.295,
'total_flos': 9.110291036818637e+16,
'train_loss': 1.8524150695953217,
'epoch': 3.0}
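The reported throughput figures can be cross-checked against each other. The sketch below is a rough sanity check and assumes a single device with no gradient accumulation, neither of which this card states, so the derived dataset size is only an estimate.

```python
# Values copied from the training output above.
global_step = 3003
train_runtime = 2319.7329        # seconds
samples_per_second = 18.122
num_epochs = 3
batch_size = 14                  # per_device_train_batch_size (assumes one device)

# Optimizer steps per epoch and per second.
steps_per_epoch = global_step // num_epochs
steps_per_second = global_step / train_runtime

# Approximate training examples seen per epoch, estimated two ways.
approx_samples_per_epoch = samples_per_second * train_runtime / num_epochs
max_samples_per_epoch = steps_per_epoch * batch_size  # upper bound if the last batch is partial

print(f"steps per epoch:          {steps_per_epoch}")
print(f"steps per second:         {steps_per_second:.3f}")
print(f"samples per epoch (est.): {approx_samples_per_epoch:.0f}")
print(f"samples per epoch (max):  {max_samples_per_epoch}")
```

Under those assumptions the two estimates agree closely, which suggests a training set of roughly 14,000 examples.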
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1 | 1.969100 | 1.756651 | 0.159100 | 0.088300 | 0.138800 | 0.138900 | 0.001600 | 20.000000 |
| 2 | 1.794000 | 1.699691 | 0.158500 | 0.090300 | 0.139500 | 0.139600 | 0.001400 | 20.000000 |
| 3 | 1.713700 | 1.687554 | 0.162700 | 0.091900 | 0.141800 | 0.141900 | 0.001660 | 20.000000 |
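Validation loss is often easier to compare as perplexity. The sketch below converts the values in the table, assuming the loss is a mean per-token cross-entropy in nats (the usual case for sequence-to-sequence models trained with the Transformers `Trainer`, but not confirmed by this card).

```python
import math

# Validation loss per epoch, copied from the table above.
validation_loss = {1: 1.756651, 2: 1.699691, 3: 1.687554}

# Perplexity = exp(mean cross-entropy); valid only under the assumption stated above.
perplexity = {epoch: math.exp(loss) for epoch, loss in validation_loss.items()}

for epoch, ppl in perplexity.items():
    print(f"epoch {epoch}: validation perplexity ~ {ppl:.2f}")
```

The gain from epoch 2 to epoch 3 is much smaller than from epoch 1 to epoch 2, consistent with the flattening validation loss in the table.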