bartowski/miniclaus-qw1.5B-UNAMGS-GGUF


fblgit committed on Nov 15, 2024

Commit 32a84ba · verified · 1 Parent(s): 2b27757

Update README.md

Files changed (1)

README.md +52 -0


@@ -16,6 +16,58 @@ model-index:
   results: []
 ---
+# miniclaus-qw1.5B-UNAMGS
+Trained with `Magpie-Align/Magpie-Pro-MT-300K-v0.1`
+Using MGS & UNA (MLP) on this tiny but powerful model.
+![miniclaus-qw1.5B-UNAMGS](https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS/resolve/main/miniclaus_qw15-UNAMGS.png)
+[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+It achieves the following results on the evaluation set:
+- Loss: 0.7193
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- train_batch_size: 1
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 8
+- total_train_batch_size: 128
+- total_eval_batch_size: 8
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 1.1641        | 0.0007 | 1    | 0.8514          |
+| 0.9246        | 0.0503 | 76   | 0.7921          |
+| 0.8791        | 0.1006 | 152  | 0.7727          |
+| 0.8507        | 0.1509 | 228  | 0.7611          |
+| 0.8376        | 0.2012 | 304  | 0.7534          |
+| 0.793         | 0.2515 | 380  | 0.7467          |
+| 0.7834        | 0.3018 | 456  | 0.7421          |
+| 0.7807        | 0.3521 | 532  | 0.7384          |
+| 0.764         | 0.4023 | 608  | 0.7359          |
+| 0.7738        | 0.4526 | 684  | 0.7320          |
+| 0.7425        | 0.5029 | 760  | 0.7300          |
+| 0.7519        | 0.5532 | 836  | 0.7279          |
+| 0.7461        | 0.6035 | 912  | 0.7255          |
+| 0.7489        | 0.6538 | 988  | 0.7245          |
+| 0.7614        | 0.7041 | 1064 | 0.7222          |
+| 0.7576        | 0.7544 | 1140 | 0.7222          |
+| 0.7303        | 0.8047 | 1216 | 0.7209          |
+| 0.7332        | 0.8550 | 1292 | 0.7199          |
+| 0.7541        | 0.9053 | 1368 | 0.7202          |
+| 0.7369        | 0.9556 | 1444 | 0.7193          |
 ## Llamacpp imatrix Quantizations of miniclaus-qw1.5B-UNAMGS
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b4058">b4058</a> for quantization.
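
The hyperparameters added in this commit list a per-device `train_batch_size` of 1 on 8 devices but a `total_train_batch_size` of 128, and the results table ties optimizer steps to epoch fractions. The sketch below works through that arithmetic. It assumes the total batch size is the product of per-device batch size, device count, and gradient accumulation steps; the accumulation value is inferred here, not stated in the README, and the variable names are illustrative only.

```python
# Values copied from the hyperparameter list in the diff above.
train_batch_size = 1           # per-device batch size
num_devices = 8
total_train_batch_size = 128

# Assumption: total = per-device batch * devices * accumulation steps.
samples_per_forward_pass = train_batch_size * num_devices
grad_accum_steps, remainder = divmod(total_train_batch_size, samples_per_forward_pass)
assert remainder == 0, "total batch size should divide evenly"
print("inferred gradient accumulation steps:", grad_accum_steps)

# Last row of the training results table: step 1444 at epoch 0.9556.
last_step, last_epoch = 1444, 0.9556
steps_per_epoch = round(last_step / last_epoch)
print("approximate optimizer steps per epoch:", steps_per_epoch)

# Batch elements consumed per epoch. These may be packed sequences
# rather than raw dataset rows, so treat this as a rough estimate.
batch_elements_per_epoch = steps_per_epoch * total_train_batch_size
print("approximate batch elements per epoch:", batch_elements_per_epoch)
```

Under these assumptions the run accumulates gradients over 16 forward passes per optimizer step and takes roughly 1511 steps to complete one epoch, which matches the table ending just short of epoch 1.0 at step 1444.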