Update readme
README.md
@@ -32,7 +32,7 @@ fall und Felsen vor dem Gebäude mit Blick auf den Fluss.
 
 - **Developed by:** [Jotschi](https://huggingface.co/Jotschi)
 - **License:** [Apache License](https://www.apache.org/licenses/LICENSE-2.0)
-- **Finetuned from model
+- **Finetuned from model:** [Mistral7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
 
 ## Uses
 
@@ -52,7 +52,11 @@ The model was trained using PEFT 4Bit Q-LoRA with the following parameters:
 
 * rank: 256
 * alpha: 16
-*
+* steps: 8500
+* bf16: True
+* lr_scheduler_type: cosine
+* warmup_ratio: 0.03
+* gradient accumulation steps: 2
 * batch size: 4
 * Input sequence length: 512
 * Learning Rate: 2.0e-5
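
The parameters added in the second hunk imply a few derived quantities that the README does not state. The following is a minimal sketch of that arithmetic using only the Python standard library. It is not the author's training script. It assumes that `steps` counts optimizer steps, that training ran in a single process, that warmup steps are the rounded product of `warmup_ratio` and `steps`, and that the cosine schedule is linear warmup followed by cosine decay to zero. All variable and function names here are hypothetical.

```python
import math

# Values copied from the README's parameter list.
RANK = 256
ALPHA = 16
STEPS = 8500
WARMUP_RATIO = 0.03
GRAD_ACCUM_STEPS = 2
BATCH_SIZE = 4
SEQ_LEN = 512
LEARNING_RATE = 2.0e-5

# Assumption: one training process, so no multiplication by a device count.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

# Common LoRA scaling factor: alpha divided by rank.
lora_scaling = ALPHA / RANK

# Assumption: warmup length is the rounded product of ratio and total steps.
warmup_steps = round(STEPS * WARMUP_RATIO)

# Upper bound only: assumes every sequence fills all 512 positions.
max_tokens_per_step = effective_batch_size * SEQ_LEN


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay.

    The schedule shape is an assumption, not taken from the README.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / (STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("LoRA scaling (alpha / rank):", lora_scaling)
    print("warmup steps:", warmup_steps)
    print("max tokens per optimizer step:", max_tokens_per_step)
    print("peak learning rate:", lr_at(warmup_steps))
```

Under these assumptions the scaling factor `alpha / rank` is well below 1, so the LoRA update is scaled down considerably relative to a configuration where alpha equals the rank.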