End of training
- README.md +18 -6
- adapter_model.bin +1 -1
README.md CHANGED
@@ -46,7 +46,7 @@ datasets:
 # output:

 test_datasets:
-  - path: data/
+  - path: data/eval.jsonl
     ds_type: json
     # You need to specify a split. For "json" datasets the default split is called "train".
     split: train
@@ -116,7 +116,7 @@ xformers_attention: null

 This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 1.5572

 ## Model description

@@ -148,10 +148,22 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-
-
-
-
+| 6.4934 | 0.25 | 1 | 2.0690 |
+| 2.5023 | 0.5 | 2 | 2.0673 |
+| 4.9022 | 0.75 | 3 | 2.0621 |
+| 5.6912 | 1.0 | 4 | 2.0491 |
+| 5.1317 | 1.25 | 5 | 2.0230 |
+| 5.5762 | 1.25 | 6 | 1.9738 |
+| 3.3504 | 1.5 | 7 | 1.9053 |
+| 5.1877 | 1.75 | 8 | 1.8346 |
+| 3.8815 | 2.0 | 9 | 1.7862 |
+| 3.5814 | 2.25 | 10 | 1.7475 |
+| 3.3579 | 2.25 | 11 | 1.6987 |
+| 3.5511 | 2.5 | 12 | 1.6555 |
+| 3.3339 | 2.75 | 13 | 1.6107 |
+| 2.8774 | 3.0 | 14 | 1.5778 |
+| 3.1427 | 3.25 | 15 | 1.5620 |
+| 3.3465 | 3.25 | 16 | 1.5572 |


 ### Framework versions
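The training results added to README.md can be checked against the reported evaluation loss. The following is a minimal sketch, not part of the commit: it copies the sixteen added table rows into a string, parses them with the standard library only, and looks up the step with the lowest validation loss. The names `ROWS` and `parse_rows` are hypothetical and were chosen for illustration.

```python
# Rows copied from the training results table added in README.md.
# Columns: training loss, epoch, step, validation loss.
ROWS = """
| 6.4934 | 0.25 | 1 | 2.0690 |
| 2.5023 | 0.5 | 2 | 2.0673 |
| 4.9022 | 0.75 | 3 | 2.0621 |
| 5.6912 | 1.0 | 4 | 2.0491 |
| 5.1317 | 1.25 | 5 | 2.0230 |
| 5.5762 | 1.25 | 6 | 1.9738 |
| 3.3504 | 1.5 | 7 | 1.9053 |
| 5.1877 | 1.75 | 8 | 1.8346 |
| 3.8815 | 2.0 | 9 | 1.7862 |
| 3.5814 | 2.25 | 10 | 1.7475 |
| 3.3579 | 2.25 | 11 | 1.6987 |
| 3.5511 | 2.5 | 12 | 1.6555 |
| 3.3339 | 2.75 | 13 | 1.6107 |
| 2.8774 | 3.0 | 14 | 1.5778 |
| 3.1427 | 3.25 | 15 | 1.5620 |
| 3.3465 | 3.25 | 16 | 1.5572 |
"""


def parse_rows(table: str) -> list[dict]:
    """Parse markdown table rows into dicts (hypothetical helper)."""
    rows = []
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return rows


rows = parse_rows(ROWS)

# Step with the lowest validation loss.
best = min(rows, key=lambda row: row["val_loss"])

# True if validation loss strictly decreases from each step to the next.
is_monotonic = all(a["val_loss"] > b["val_loss"] for a, b in zip(rows, rows[1:]))

print(best["step"], best["val_loss"])  # prints: 16 1.5572
print(is_monotonic)                    # prints: True
```

On these rows the lowest validation loss is the last one logged (step 16), which matches the `Loss: 1.5572` line added in the second hunk. The training loss column is much noisier than the validation loss, so the sketch compares only the validation column.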
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:44772c4af3abc75d6063ca37102b982b62d41ac5fad308cde51f7d47e39986ef
 size 101036698
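The adapter_model.bin diff changes only a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. As a rough sketch, assuming the weights file has been downloaded locally, the pointer can be parsed and compared against that file using only the standard library. The names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of Git LFS or any Hugging Face library.

```python
import hashlib
from pathlib import Path

# New-side pointer text as it appears in the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:44772c4af3abc75d6063ca37102b982b62d41ac5fad308cde51f7d47e39986ef
size 101036698
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into its fields (hypothetical helper)."""
    # Each line is "<key> <value>", separated by a single space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing the whole file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so a ~100 MB file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the file, `matches_pointer("adapter_model.bin", pointer)` would report whether the download corresponds to this commit; the path here is an assumption. The size is identical on both sides of the diff (101036698 bytes), so only the digest distinguishes the old and new weights.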