---
license: apache-2.0
base_model: rAIfle/SorcererLM-8x22b-bf16
quantized_by: Quant-Cartel
---
```
  e88 88e                                  d8
 d888 888b   8888 8888   ,"Y88b  888 8e   d88
C8888 8888D  8888 8888  "8" 888  888 88b d88888
 Y888 888P   Y888 888P  ,ee 888  888 888  888
  "88 88"     "88 88"   "88 888  888 888  888
      b
      8b,

 e88'Y88                   d8           888
d888  'Y   ,"Y88b 888,8,  d88    ,e e,  888
C8888     "8" 888 888 "  d88888 d88 88b 888
 Y888  ,d ,ee 888 888     888   888   , 888
  "88,d88 "88 888 888     888    "YeeP" 888

          PROUDLY PRESENTS
```
# SorcererLM-8x22b-bf16-exl2-longcal

Quantized using 115 rows of 8192 tokens from the default ExLlamaV2 calibration dataset.

Branches (see the download sketch after this list):
- `main` -- `measurement.json`
- `8b8h` -- 8bpw, 8bit lm_head
- `6b6h` -- 6bpw, 6bit lm_head
- `5b6h` -- 5bpw, 6bit lm_head
- `4.5b6h` -- 4.5bpw, 6bit lm_head
- `4b6h` -- 4bpw, 6bit lm_head
- `3b6h` -- 3bpw, 6bit lm_head
- `2.25b6h` -- 2.25bpw, 6bit lm_head

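
Each branch is a standalone quant, so you only need to pull the one that fits your VRAM. A minimal sketch using `huggingface_hub`; the branch and output directory below are just examples:

```python
# Download a single quant branch (here: 6bpw / 6-bit lm_head) into a local folder.
# Requires `pip install huggingface_hub`; branch and local_dir are examples, adjust as needed.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="Quant-Cartel/SorcererLM-8x22b-bf16-exl2-longcal",
    revision="6b6h",                           # branch name from the list above
    local_dir="SorcererLM-8x22b-exl2-6b6h",    # hypothetical output directory
)
print(f"Quant downloaded to: {local_path}")
```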
Original model link: [rAIfle/SorcererLM-8x22b-bf16](https://huggingface.co/rAIfle/SorcererLM-8x22b-bf16)

Original model README below.

-----

# SorcererLM-8x22b-bf16

Oh boy, here we go. Low-rank (`r=16, alpha=32`) LoRA on top of [WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), trained on 2 epochs of (cleaned & deduped) c2-logs. As far as I can tell, this is an upgrade from `WizardLM-2-8x22B` for RP purposes.

Alongside this ready-to-use release there are also two "mergebait" alternatives, `rAIfle/sorcLM-epoch1-bf16` and `rAIfle/sorcLM-epoch2-bf16`. These are very experimental and likely not fit for use, but could be interesting for merging.

## Why A LoRA?

The choice was fully intentional. I briefly considered a FFT (full finetune), but for this particular use-case a LoRA seemed a better fit. `WizardLM-2-8x22B` is smart by itself, but the vocabulary it uses leaves much to be desired when it comes to RP. By training a low-rank LoRA on top of it to teach it some of Claude's writing style, we remedy that.
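
For illustration only: the actual training was done with qlora-pipe (see the Training section below for the real configs), but a rough PEFT-style sketch of what an `r=16, alpha=32` low-rank adapter on top of the base model amounts to is shown here. The `target_modules` list is an assumption, not necessarily the set used for this model:

```python
# Illustrative PEFT sketch only -- the real run used qlora-pipe (configs in the `train` subfolder).
# target_modules is an assumed attention-only set; loading the full 8x22B in bf16 needs serious hardware.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "alpindale/WizardLM-2-8x22B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_cfg = LoraConfig(
    r=16,            # low rank, as stated above
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the small adapter trains; the 8x22B base stays frozen
```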
## Prompting

- Use the templates in [Quant-Cartel/Recommended-Settings](https://huggingface.co/Quant-Cartel/Recommended-Settings) under the `SorcererLM` folder.
- Or Vicuna 1.1 and a sane context template (see the sketch after this list). It's somewhat sensitive to samplers; I'd recommend Temperature 1, MinP 0.05 and a dash of DRY, but YMMV. Shorter prompts seem to work better, too.
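
For reference, a minimal sketch of the Vicuna 1.1 prompt format together with the sampler values suggested above. The system line and the DRY numbers are illustrative assumptions, not settings shipped with the model, and parameter names may differ per backend:

```python
# Minimal Vicuna 1.1-style prompt builder plus the suggested sampler values.
# The system prompt and DRY numbers are illustrative guesses, not official settings.
def build_vicuna_prompt(user_message: str,
                        system: str = ("A chat between a curious user and an "
                                       "artificial intelligence assistant.")) -> str:
    # Vicuna 1.1 format: system text, then alternating USER:/ASSISTANT: turns.
    return f"{system} USER: {user_message} ASSISTANT:"

sampler_settings = {
    "temperature": 1.0,     # as recommended above
    "min_p": 0.05,          # as recommended above
    "dry_multiplier": 0.8,  # "a dash of DRY" -- value is a guess, tune to taste
    "dry_base": 1.75,
}

prompt = build_vicuna_prompt("Describe the sorcerer's tower in two sentences.")
# Pass `prompt` and `sampler_settings` to whatever backend serves the quant;
# key names for DRY and MinP vary between frontends/backends.
print(prompt)
```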
## Quantized Versions

- [iMat GGUFs](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-iMat-GGUF)
- [longcal exl2s](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-bf16-exl2-longcal)

## Acknowledgments

The main shoutout I want to make is to my [Cartel](https://huggingface.co/Quant-Cartel) bros, [Envoid](https://huggingface.co/Envoid) and particularly [I^2](https://huggingface.co/InferenceIllusionist), for being amazing.

## Training

Trained using [qlora-pipe](https://github.com/tdrussell/qlora-pipe). Configs are included in the `train` subfolder.