owanr/SChem5Labels-google-t5-v1_1-large-intra_model-shuffle-human_annots_str


	---
	license: apache-2.0
	base_model: google/t5-v1_1-large
	tags:
	- generated_from_trainer
	model-index:
	- name: SChem5Labels-google-t5-v1_1-large-intra_model-shuffle-human_annots_str
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# SChem5Labels-google-t5-v1_1-large-intra_model-shuffle-human_annots_str

	This model is a fine-tuned version of [google/t5-v1_1-large](https://huggingface.co/google/t5-v1_1-large) on an unspecified dataset (the Trainer recorded no dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 1.2617
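
Since the card gives no usage instructions, the following is a minimal loading sketch rather than a documented recipe. It assumes the checkpoint is published under the repository id shown at the top of this page and that it loads with the generic `transformers` seq2seq classes, as T5 checkpoints normally do. The helper name `generate` and the placeholder prompt are illustrative only; the expected input format is not described anywhere in this card.

```python
# Assumed repository id, taken from the owner and model name on this page.
MODEL_ID = "owanr/SChem5Labels-google-t5-v1_1-large-intra_model-shuffle-human_annots_str"


def generate(text: str, max_new_tokens: int = 16) -> str:
    """Run one text-to-text generation with the fine-tuned checkpoint.

    Hypothetical helper: the prompt format the model expects is undocumented.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
#   print(generate("example input"))
```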

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 128
	- eval_batch_size: 128
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 200
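
To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate it would produce at a given optimizer step. It rests on assumptions that the card does not state: no warmup steps, and a schedule planned for the full 200 epochs at 25 steps per epoch (the step count per epoch comes from the training results table). The function name `linear_lr` is illustrative, not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 1e-4,
              total_steps: int = 200 * 25,
              warmup_steps: int = 0) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps.

    Assumptions (not stated in the card): warmup_steps is 0 and the schedule
    spans all 200 configured epochs at 25 steps each.
    """
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at the last logged step, and at the planned end.
for step in (0, 875, 5000):
    print(step, linear_lr(step))
```

Under these assumptions the rate at step 875, the last step in the training log, would still be about 8.25e-05, so most of the planned decay had not yet happened when the log ends.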

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 20.6863       | 1.0   | 25   | 23.6720         |
	| 19.4335       | 2.0   | 50   | 19.9692         |
	| 18.2125       | 3.0   | 75   | 14.6175         |
	| 16.4919       | 4.0   | 100  | 11.6548         |
	| 15.9533       | 5.0   | 125  | 10.4693         |
	| 14.4388       | 6.0   | 150  | 9.6944          |
	| 12.4697       | 7.0   | 175  | 9.3650          |
	| 10.5569       | 8.0   | 200  | 9.1423          |
	| 9.2725        | 9.0   | 225  | 9.0209          |
	| 8.4222        | 10.0  | 250  | 8.8843          |
	| 8.3219        | 11.0  | 275  | 8.8191          |
	| 8.3309        | 12.0  | 300  | 8.7311          |
	| 8.1401        | 13.0  | 325  | 8.5687          |
	| 8.0179        | 14.0  | 350  | 8.2683          |
	| 7.7326        | 15.0  | 375  | 7.9618          |
	| 7.606         | 16.0  | 400  | 7.7520          |
	| 7.4445        | 17.0  | 425  | 7.6258          |
	| 7.2501        | 18.0  | 450  | 7.5502          |
	| 7.2915        | 19.0  | 475  | 7.5063          |
	| 7.2094        | 20.0  | 500  | 7.4555          |
	| 7.0879        | 21.0  | 525  | 7.3983          |
	| 7.1268        | 22.0  | 550  | 7.3460          |
	| 6.623         | 23.0  | 575  | 0.9955          |
	| 1.0983        | 24.0  | 600  | 0.9945          |
	| 1.0196        | 25.0  | 625  | 0.9727          |
	| 0.9822        | 26.0  | 650  | 0.9680          |
	| 0.9827        | 27.0  | 675  | 0.9678          |
	| 0.9832        | 28.0  | 700  | 0.9646          |
	| 0.9983        | 29.0  | 725  | 0.9679          |
	| 0.982         | 30.0  | 750  | 0.9610          |
	| 1.0109        | 31.0  | 775  | 0.9632          |
	| 0.9972        | 32.0  | 800  | 0.9628          |
	| 0.9906        | 33.0  | 825  | 0.9634          |
	| 0.9886        | 34.0  | 850  | 0.9634          |
	| 0.9811        | 35.0  | 875  | 0.9649          |
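
The log above can be summarized with a few lines of Python. The sketch below copies the table rows into a list and looks up the epoch with the lowest validation loss and the largest epoch-to-epoch drop. The helper names `best_epoch` and `largest_drop` are illustrative; nothing here is part of the original training code.

```python
# (epoch, step, training_loss, validation_loss), copied from the table above.
LOG = [
    (1, 25, 20.6863, 23.6720), (2, 50, 19.4335, 19.9692), (3, 75, 18.2125, 14.6175),
    (4, 100, 16.4919, 11.6548), (5, 125, 15.9533, 10.4693), (6, 150, 14.4388, 9.6944),
    (7, 175, 12.4697, 9.3650), (8, 200, 10.5569, 9.1423), (9, 225, 9.2725, 9.0209),
    (10, 250, 8.4222, 8.8843), (11, 275, 8.3219, 8.8191), (12, 300, 8.3309, 8.7311),
    (13, 325, 8.1401, 8.5687), (14, 350, 8.0179, 8.2683), (15, 375, 7.7326, 7.9618),
    (16, 400, 7.606, 7.7520), (17, 425, 7.4445, 7.6258), (18, 450, 7.2501, 7.5502),
    (19, 475, 7.2915, 7.5063), (20, 500, 7.2094, 7.4555), (21, 525, 7.0879, 7.3983),
    (22, 550, 7.1268, 7.3460), (23, 575, 6.623, 0.9955), (24, 600, 1.0983, 0.9945),
    (25, 625, 1.0196, 0.9727), (26, 650, 0.9822, 0.9680), (27, 675, 0.9827, 0.9678),
    (28, 700, 0.9832, 0.9646), (29, 725, 0.9983, 0.9679), (30, 750, 0.982, 0.9610),
    (31, 775, 1.0109, 0.9632), (32, 800, 0.9972, 0.9628), (33, 825, 0.9906, 0.9634),
    (34, 850, 0.9886, 0.9634), (35, 875, 0.9811, 0.9649),
]


def best_epoch(log):
    """Return the row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


def largest_drop(log):
    """Return (drop, epoch) for the biggest validation-loss decrease between consecutive epochs."""
    drops = [(prev[3] - cur[3], cur[0]) for prev, cur in zip(log, log[1:])]
    return max(drops)


best = best_epoch(LOG)
drop, drop_epoch = largest_drop(LOG)
print(f"best validation loss {best[3]:.4f} at epoch {best[0]} (step {best[1]})")
print(f"largest drop {drop:.4f} going into epoch {drop_epoch}")
```

The lowest logged validation loss is 0.9610 at epoch 30, and the sharpest decrease happens between epochs 22 and 23 (7.3460 to 0.9955). The log ends at epoch 35 even though `num_epochs` is 200; the card does not say why.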


	### Framework versions

	- Transformers 4.34.0
	- Pytorch 2.1.0+cu121
	- Datasets 2.6.1
	- Tokenizers 0.14.1
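
To compare a local environment against the versions listed above, something like the following sketch can be used. It assumes the usual PyPI distribution names (`torch` for Pytorch); the helper names `installed_versions` and `mismatches` are illustrative.

```python
from importlib import metadata

# Versions listed in this card; "torch" is the assumed distribution name for Pytorch.
EXPECTED = {
    "transformers": "4.34.0",
    "torch": "2.1.0+cu121",
    "datasets": "2.6.1",
    "tokenizers": "0.14.1",
}


def installed_versions(packages):
    """Look up installed versions; missing packages map to None."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(expected, installed):
    """Return {package: (expected, installed)} for every version that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


# Report differences between this environment and the card.
for name, (want, have) in mismatches(EXPECTED, installed_versions(EXPECTED)).items():
    print(f"{name}: card lists {want}, found {have}")
```

A mismatch does not necessarily mean the checkpoint will fail to load; the listed versions are only those recorded at training time.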