neph1 committed on
Commit e74ee85 · verified · 1 Parent(s): c3ab79f

Update README.md

  1. README.md +21 -21
README.md CHANGED
@@ -42,27 +42,6 @@ The base model is pretty good at Swedish already, but my 'vibe check' says this

  Bellman is trained on fairly short answers and tends to be less verbose.

- ### Training Parameters
- per_device_train_batch_size = 2,<br>
- gradient_accumulation_steps = 64,<br>
- num_train_epochs=3,<br>
- warmup_steps = 5,<br>
- learning_rate = 1e-4,<br>
- logging_steps = 15,<br>
- optim = "adamw_8bit",<br>
- weight_decay = 0.01,<br>
- lr_scheduler_type = "linear",<br>
- seed = 3407,<br>
- per_device_eval_batch_size = 2,<br>
- evaluation_strategy="steps",<br>
- eval_accumulation_steps = 64,<br>
- eval_steps = 15,<br>
- eval_delay = 0,<br>
- save_strategy="steps",<br>
- save_steps=50,<br>
-
- ### Model Description
-
  Output example (Mistral-Nemo-Instruct-bellman-12b.i1-Q4_K_M.gguf):

  User: Hej!
@@ -85,6 +64,27 @@ Output example (Mistral-Nemo-Instruct-bellman-12b.i1-Q4_K_M.gguf):

  Bellman: Gustav Vasa levde från år 1496 till 1560.

+ ### Training Parameters
+ per_device_train_batch_size = 2,<br>
+ gradient_accumulation_steps = 64,<br>
+ num_train_epochs=3,<br>
+ warmup_steps = 5,<br>
+ learning_rate = 1e-4,<br>
+ logging_steps = 15,<br>
+ optim = "adamw_8bit",<br>
+ weight_decay = 0.01,<br>
+ lr_scheduler_type = "linear",<br>
+ seed = 3407,<br>
+ per_device_eval_batch_size = 2,<br>
+ evaluation_strategy="steps",<br>
+ eval_accumulation_steps = 64,<br>
+ eval_steps = 15,<br>
+ eval_delay = 0,<br>
+ save_strategy="steps",<br>
+ save_steps=50,<br>
+
+ ### Model Description
+

  - **Developed by:** Me
  - **Funded by:** Me
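
The values in the relocated Training Parameters section imply an effective batch size and a learning-rate curve that are easy to work out by hand. The sketch below is plain Python and is not the author's training script. It assumes a single training device (`num_devices` is a hypothetical parameter; the README does not state the hardware) and a made-up `TOTAL_STEPS`, since the dataset size is not given. `linear_lr` follows the usual shape of a linear schedule with warmup: a ramp from zero to the peak rate, then a straight-line decay to zero.

```python
# Values copied from the "Training Parameters" section shown in the diff.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 64
LEARNING_RATE = 1e-4
WARMUP_STEPS = 5


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update.

    num_devices is an assumption; the README does not say how many GPUs were used.
    """
    return per_device * accumulation * num_devices


def linear_lr(step: int, peak_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step` for a linear schedule with linear warmup."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Hypothetical number of optimizer steps; the real value depends on dataset size.
    TOTAL_STEPS = 105

    print("effective batch size:",
          effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, WARMUP_STEPS, 55, TOTAL_STEPS):
        print(step, linear_lr(step, LEARNING_RATE, WARMUP_STEPS, TOTAL_STEPS))
```

With a per-device batch of 2 and 64 accumulation steps, each optimizer update sees 128 samples on one device, which is why such a small per-device batch can still train stably.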