ollama pull Tohur/natsumura-storytelling-rp-llama-3.1
- tdh87/Just-stories
- tdh87/Just-stories-2

The following parameters were used in [Llama Factory](https://github.com/hiyouga/LLaMA-Factory) during training:

- per_device_train_batch_size=2
- gradient_accumulation_steps=4
- lr_scheduler_type="cosine"
- logging_steps=10
- warmup_ratio=0.1
- save_steps=1000
- learning_rate=2e-5
- num_train_epochs=3.0
- max_samples=500
- max_grad_norm=1.0
- quantization_bit=4
- loraplus_lr_ratio=16.0
- fp16=True
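For reference, these options map onto a LLaMA-Factory training YAML roughly as sketched below. This is an illustration only: the model path, dataset names, and output directory are placeholders I have filled in, not values confirmed by this README, and the LoRA/SFT stage settings are assumptions based on LLaMA-Factory's usual fine-tuning setup.

```yaml
### model (placeholder; substitute the actual base model)
model_name_or_path: meta-llama/Meta-Llama-3.1-8B-Instruct
quantization_bit: 4

### method (assumed LoRA SFT run)
stage: sft
do_train: true
finetuning_type: lora
loraplus_lr_ratio: 16.0

### dataset (placeholder names; datasets must be registered in dataset_info.json)
dataset: just_stories,just_stories_2
max_samples: 500

### output (placeholder path)
output_dir: saves/llama3.1/lora/sft
logging_steps: 10
save_steps: 1000

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2.0e-5
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
max_grad_norm: 1.0
fp16: true
```

LLaMA-Factory runs a file like this with `llamafactory-cli train <config>.yaml`.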
## Inference

I use the following settings for inference: