add things I already know pre model card 2
README.md CHANGED
@@ -32,7 +32,81 @@ This model's training was sponsored by [sablo.ai](https://sablo.ai).
 
 axolotl version: `0.4.0`
 ```yaml
+base_model: meta-math/MetaMath-Mistral-7B
+model_type: MistralForCausalLM
+tokenizer_type: LlamaTokenizer
+is_mistral_derived_model: true
+
+load_in_8bit: false
+load_in_4bit: false
+strict: false
 
+chat_template: alpaca
+datasets:
+  - path: microsoft/orca-math-word-problems-200k
+    type: alpaca_chat.load_qa
+    conversation: alpaca
+
+  - path: TIGER-Lab/MathInstruct
+    type: alpaca
+    conversation: alpaca
+
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.005
+#val_set_size: 0.0
+
+output_dir: ./EulerMath-Mistral-7B-model
+
+sequence_len: 8192
+sample_packing: true
+pad_to_sequence_len: true
+eval_sample_packing: false
+
+wandb_project: Euler
+wandb_entity:
+wandb_watch:
+wandb_name:
+wandb_log_model:
+hub_model_id: Weyaxi/EulerMath-Mistral-7B
+
+save_safetensors: true
+
+gradient_accumulation_steps: 4
+micro_batch_size: 2 # changed
+num_epochs: 2
+optimizer: adamw_bnb_8bit
+lr_scheduler: cosine
+learning_rate: 0.000005
+
+train_on_inputs: false
+group_by_length: false
+bf16: true
+fp16: false
+tf32: false
+
+gradient_checkpointing: true
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+
+warmup_steps: 10
+evals_per_epoch: 4 # changed
+eval_table_size:
+eval_table_max_new_tokens: 128
+saves_per_epoch: 1 # changed
+debug:
+
+deepspeed: zero3_bf16.json
+weight_decay: 0.0
+fsdp:
+fsdp_config:
+special_tokens:
+  bos_token: "<s>"
+  eos_token: "</s>"
+  unk_token: "<unk>"
 ```
 
 </details><br>
@@ -75,7 +149,7 @@ Quantized versions of this model are currently not available. It will be ava
 
 This model was fully fine-tuned for 2 epochs.
 
-Total number of steps was
+Total number of steps was 544.
 
 <details><summary>Loss graph</summary>
 
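For context on the reported step count: with `micro_batch_size: 2` and `gradient_accumulation_steps: 4`, each optimizer step consumes `2 × 4 × world_size` packed sequences, so 544 steps over 2 epochs implies about 272 steps per epoch. The sketch below works through that arithmetic; the packed-sequence count and GPU count are hypothetical values chosen to reproduce 544, not figures from this card.

```python
# Back-of-the-envelope check of the step count. packed_seqs and world_size
# are assumptions (not stated in the card), chosen so the result is 544.
def optimizer_steps(packed_seqs: int, micro_batch_size: int,
                    grad_accum: int, world_size: int, epochs: int) -> int:
    # Sequences consumed per optimizer step across all GPUs.
    seqs_per_step = micro_batch_size * grad_accum * world_size
    return epochs * (packed_seqs // seqs_per_step)

# Example: ~17.4k packed sequences on 8 GPUs -> 2 * (17408 // 64) = 544.
print(optimizer_steps(17_408, 2, 4, 8, 2))  # 544
```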
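Since the config trains with `chat_template: alpaca` and publishes to `Weyaxi/EulerMath-Mistral-7B`, a downstream user would prompt the checkpoint in Alpaca format. Below is a minimal usage sketch using the standard `transformers` API; the prompt wording is an illustration, not an official example from this card.

```python
# Minimal usage sketch for the published checkpoint; standard transformers
# API, not an official example from this model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Weyaxi/EulerMath-Mistral-7B"  # hub_model_id from the config
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Alpaca-style prompt, matching chat_template: alpaca in the training config.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nSolve: 12 * 17 = ?\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```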