---
license: apache-2.0
base_model: rAIfle/SorcererLM-8x22b-bf16
quantized_by: Quant-Cartel
---
```
  e88 88e                                  d8
 d888 888b   8888 8888   ,"Y88b  888 8e   d88
C8888 8888D  8888 8888  "8" 888  888 88b d88888
 Y888 888P   Y888 888P  ,ee 888  888 888  888
  "88 88"     "88 88"   "88 888  888 888  888
      b
      8b,

 e88'Y88                   d8           888
d888  'Y   ,"Y88b 888,8,  d88    ,e e,  888
C8888     "8" 888 888 "  d88888 d88 88b 888
 Y888  ,d ,ee 888 888     888   888   , 888
  "88,d88 "88 888 888     888    "YeeP" 888

          PROUDLY PRESENTS
```
# SorcererLM-8x22b-bf16-exl2-longcal

Quantized using 115 rows of 8192 tokens from the default ExLlamaV2 calibration dataset.

Branches (see the download sketch after this list):
- `main` -- `measurement.json`
- `8b8h` -- 8bpw, 8bit lm_head
- `6b6h` -- 6bpw, 6bit lm_head
- `5b6h` -- 5bpw, 6bit lm_head
- `4.5b6h` -- 4.5bpw, 6bit lm_head
- `4b6h` -- 4bpw, 6bit lm_head
- `3b6h` -- 3bpw, 6bit lm_head
- `2.25b6h` -- 2.25bpw, 6bit lm_head

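
Each branch is a standalone quant, so you only need to pull the one that fits your VRAM. A minimal sketch using `huggingface_hub`; the branch and output directory below are just examples:

```python
# Download a single quant branch (here: 6bpw / 6-bit lm_head) into a local folder.
# Requires `pip install huggingface_hub`; branch and local_dir are examples, adjust as needed.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="Quant-Cartel/SorcererLM-8x22b-bf16-exl2-longcal",
    revision="6b6h",                           # branch name from the list above
    local_dir="SorcererLM-8x22b-exl2-6b6h",    # hypothetical output directory
)
print(f"Quant downloaded to: {local_path}")
```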
Original model link: [rAIfle/SorcererLM-8x22b-bf16](https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16)

Original model README below.

-----

# SorcererLM-8x22b-bf16

Oh boy, here we go. Low-rank (`r=16, alpha=32`) LoRA on top of [WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), trained on 2 epochs of (cleaned & deduped) c2-logs. As far as I can tell, this is an upgrade from `WizardLM-2-8x22B` for RP purposes.

Alongside this ready-to-use release there are also two "mergebait" alternatives, `rAIfle/sorcLM-epoch1-bf16` and `rAIfle/sorcLM-epoch2-bf16`. These are very experimental and likely not fit for use, but could be interesting for merging.

## Why A LoRA?

The choice was fully intentional. I briefly considered a FFT (full finetune), but for this particular use-case a LoRA seemed a better fit. `WizardLM-2-8x22B` is smart by itself, but the vocabulary it uses leaves much to be desired when it comes to RP. By training a low-rank LoRA on top of it to teach it some of Claude's writing style, we remedy that.
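
For illustration only: the actual training was done with qlora-pipe (see the Training section below for the real configs), but a rough PEFT-style sketch of what an `r=16, alpha=32` low-rank adapter on top of the base model amounts to is shown here. The `target_modules` list is an assumption, not necessarily the set used for this model:

```python
# Illustrative PEFT sketch only -- the real run used qlora-pipe (configs in the `train` subfolder).
# target_modules is an assumed attention-only set; loading the full 8x22B in bf16 needs serious hardware.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "alpindale/WizardLM-2-8x22B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_cfg = LoraConfig(
    r=16,            # low rank, as stated above
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the small adapter trains; the 8x22B base stays frozen
```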
## Prompting

- Use the templates in [Quant-Cartel/Recommended-Settings](https://huggingface.co/Quant-Cartel/Recommended-Settings) under the `SorcererLM` folder.
- Or Vicuna 1.1 and a sane context template (see the sketch after this list). It's somewhat sensitive to samplers; I'd recommend Temperature 1, MinP 0.05 and a dash of DRY, but YMMV. Shorter prompts seem to work better, too.
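
For reference, a minimal sketch of the Vicuna 1.1 prompt format together with the sampler values suggested above. The system line and the DRY numbers are illustrative assumptions, not settings shipped with the model, and parameter names may differ per backend:

```python
# Minimal Vicuna 1.1-style prompt builder plus the suggested sampler values.
# The system prompt and DRY numbers are illustrative guesses, not official settings.
def build_vicuna_prompt(user_message: str,
                        system: str = ("A chat between a curious user and an "
                                       "artificial intelligence assistant.")) -> str:
    # Vicuna 1.1 format: system text, then alternating USER:/ASSISTANT: turns.
    return f"{system} USER: {user_message} ASSISTANT:"

sampler_settings = {
    "temperature": 1.0,     # as recommended above
    "min_p": 0.05,          # as recommended above
    "dry_multiplier": 0.8,  # "a dash of DRY" -- value is a guess, tune to taste
    "dry_base": 1.75,
}

prompt = build_vicuna_prompt("Describe the sorcerer's tower in two sentences.")
# Pass `prompt` and `sampler_settings` to whatever backend serves the quant;
# key names for DRY and MinP vary between frontends/backends.
print(prompt)
```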
## Quantized Versions

- [iMat GGUFs](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-iMat-GGUF)
- [longcal exl2s](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-bf16-exl2-longcal)

## Acknowledgments

The main shoutout I want to make is to my [Cartel](https://huggingface.co/Quant-Cartel) bros, [Envoid](https://huggingface.co/Envoid) and particularly [I^2](https://huggingface.co/InferenceIllusionist), for being amazing.

## Training

Trained using [qlora-pipe](https://github.com/tdrussell/qlora-pipe). Configs are included in the `train` subfolder.