MiuN2k3/mtl-xlmr-large-viwiki-v2

MiuN2k3 committed on Jul 31, 2024

Commit 879947c (verified)

1 Parent(s): bd4cd86

End of training

Files changed (2)

README.md +19 -19
model.safetensors +1 -1

README.md CHANGED

@@ -1,21 +1,21 @@
 ---
 license: mit
-base_model: xlm-roberta-base
 tags:
 - generated_from_trainer
 model-index:
-- name: mtl-xlmr-base-viwiki-v2
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# mtl-xlmr-base-viwiki-v2
-This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6919
 ## Model description
@@ -35,8 +35,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -45,18 +45,18 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.7756        | 1.0   | 960  | 0.7972          |
-| 0.6104        | 2.0   | 1920 | 0.6775          |
-| 0.5942        | 3.0   | 2880 | 0.6227          |
-| 0.6037        | 4.0   | 3840 | 0.6349          |
-| 0.5208        | 5.0   | 4800 | 0.5975          |
-| 0.347         | 6.0   | 5760 | 0.6008          |
-| 0.415         | 7.0   | 6720 | 0.6142          |
-| 0.3473        | 8.0   | 7680 | 0.6252          |
-| 0.3312        | 9.0   | 8640 | 0.6748          |
-| 0.2134        | 10.0  | 9600 | 0.6919          |
 ### Framework versions

 ---
 license: mit
+base_model: xlm-roberta-large
 tags:
 - generated_from_trainer
 model-index:
+- name: mtl-xlmr-large-viwiki-v2
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# mtl-xlmr-large-viwiki-v2
+This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.6167
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
+- train_batch_size: 8
+- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 0.5611        | 1.0   | 1920  | 0.5882          |
+| 0.3039        | 2.0   | 3840  | 0.5782          |
+| 0.2045        | 3.0   | 5760  | 0.5083          |
+| 0.2969        | 4.0   | 7680  | 0.7146          |
+| 0.0895        | 5.0   | 9600  | 0.8017          |
+| 0.0781        | 6.0   | 11520 | 1.0214          |
+| 0.0002        | 7.0   | 13440 | 1.1289          |
+| 0.0029        | 8.0   | 15360 | 1.4217          |
+| 0.041         | 9.0   | 17280 | 1.5223          |
+| 0.0           | 10.0  | 19200 | 1.6167          |
 ### Framework versions
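
A reading aid for the new training table, as a minimal sketch rather than anything taken from the author's training script: the validation losses below are copied from the `xlm-roberta-large` rows above, and the code finds the epoch with the lowest validation loss and compares it with the final epoch. The example-count estimate assumes a single device and no gradient accumulation, which this diff does not state explicitly.

```python
# Validation loss per epoch, copied from the new (xlm-roberta-large) table.
val_loss = {
    1: 0.5882, 2: 0.5782, 3: 0.5083, 4: 0.7146, 5: 0.8017,
    6: 1.0214, 7: 1.1289, 8: 1.4217, 9: 1.5223, 10: 1.6167,
}

# Epoch with the lowest validation loss versus the last epoch,
# which is the one reported in the card's "Loss" line.
best_epoch = min(val_loss, key=val_loss.get)
final_epoch = max(val_loss)

print(f"best epoch:  {best_epoch} (validation loss {val_loss[best_epoch]})")
print(f"final epoch: {final_epoch} (validation loss {val_loss[final_epoch]})")

# Rough number of training examples seen per epoch.
# ASSUMPTION: one device and no gradient accumulation, so
# steps per epoch * train_batch_size approximates the training set size.
old_examples = 960 * 16   # old card: 960 steps per epoch, train_batch_size 16
new_examples = 1920 * 8   # new card: 1920 steps per epoch, train_batch_size 8
print(old_examples, new_examples)
```

Under those assumptions both runs cover the same number of examples per epoch, so the doubled step count follows from the halved batch size. The reported loss of 1.6167 is the last-epoch value, while validation loss was lowest at epoch 3. If the checkpoint with the lowest validation loss is the one wanted, `transformers` offers `load_best_model_at_end` in `TrainingArguments` and `EarlyStoppingCallback`; whether either was used for this run is not visible in the diff.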

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8f5b487b16ec0678ba65e3b4143299fd341a7b4804bcf1b022cb4ff0a72d568b
 size 2235424492

 version https://git-lfs.github.com/spec/v1
+oid sha256:a78132a1676af0bcb34260650a05b9271ff9518387f30911cfa32a422de86fcb
 size 2235424492
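
The `model.safetensors` entry above is a Git LFS pointer, not the weights themselves: it records the spec version, a SHA-256 object id, and the size in bytes. As a hedged sketch, a downloaded file could be checked against the new pointer along these lines. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path

# New-side pointer text, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a78132a1676af0bcb34260650a05b9271ff9518387f30911cfa32a422de86fcb
size 2235424492
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a multi-gigabyte file is never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
# Usage, assuming the weights were downloaded to ./model.safetensors:
# print(matches_pointer("model.safetensors", pointer))
```

The size check runs first because it is cheap and rules out truncated downloads before any hashing is done.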