Text Generation
GGUF
English
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
horror
gemma
mergekit
Inference Endpoints
conversational
<h3>Gemma-The-Writer-DEADLINE-10B-GGUF</h3>

<img src="gemma-deadline.jpg" style="float:right; width:300px; height:300px; padding:10px;">

This is a Gemma2 model merge of the top storytelling / writing models as noted at EQBench, tuned specifically for fiction, story, and writing.

This model requires GEMMA Instruct template, and has 8k context window but is ex
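Since the model requires the Gemma Instruct template, here is a minimal plain-Python sketch of that prompt format. The turn markers below are the standard Gemma/Gemma2 chat-template tokens; this is an illustration only - verify against your front-end's own chat template before relying on it:

```python
def gemma_instruct_prompt(user_message: str) -> str:
    """Wrap a single user message in the standard Gemma Instruct turn format.

    The model's reply is generated after the final '<start_of_turn>model' marker.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example: build a storytelling prompt for this model.
prompt = gemma_instruct_prompt("Write a 1000 word scene about a storm at sea.")
```

Most front-ends (e.g. those reading the GGUF's embedded chat template) apply this wrapping automatically, so manual formatting is usually only needed for raw completion APIs.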

This version - "Deadline" - is a modified version of "Gemma The Writer 9B" ( https://huggingface.co/DavidAU/Gemma-The-Writer-9B-GGUF ) that has been augmented with a Brainstorm 5x adapter to alter output generation.

This adds close to 1B parameters to the model, raising it to 46 layers and 508 tensors, for a total of 10B parameters.

The addition of Brainstorm has altered the prose and sentence structure, reduced GPT-isms, and generally improved the model's performance.

It also raises the average output length - in some cases almost doubling it.

Recommended rep pen of 1.02 or higher; temp range 0-5. (See other settings notes below.)

Example outputs below.

<B>Settings, Quants and Critical Operations Notes:</b>

This model has been modified ("Brainstorm") to alter prose output, and it generally outputs longer text than average.

A change in temp (e.g., .4, .8, 1.5, 2, 3) will drastically alter the output.
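As a rough illustration of why temp changes output so drastically, here is a minimal plain-Python sketch of temperature-scaled softmax over toy logits - not tied to this model or any particular inference engine:

```python
import math

def softmax_with_temperature(logits, temp):
    """Divide logits by temp, then softmax.

    Low temp sharpens the distribution (the top token dominates);
    high temp flattens it (more tokens become plausible picks).
    """
    scaled = [l / temp for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token logits
cool = softmax_with_temperature(logits, 0.4)
hot = softmax_with_temperature(logits, 2.0)
# At temp 0.4 the top token takes almost all the probability mass;
# at temp 2.0 the distribution flattens, which is why temp 2-5
# produces far more varied (and riskier) prose.
```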

Rep pen settings will also alter the output.

This model needs a rep pen of 1.02 or higher.

For role play, a rep pen of 1.05 to 1.08 is suggested.

Raise/lower rep pen SLOWLY, i.e. 1.011, 1.012 ...

Rep pen will alter prose, word choice (a lower rep pen can sometimes favor smaller/simpler words) and creativity.
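For intuition on why such tiny rep pen steps (1.011, 1.012 ...) matter, here is a minimal sketch of the classic CTRL-style repetition penalty used by llama.cpp-style samplers - toy numbers for illustration, not this model's exact implementation:

```python
def apply_repeat_penalty(logits, seen_tokens, penalty):
    """Shrink logits of tokens that already appeared in the context.

    Positive logits are divided by the penalty and negative ones
    multiplied by it, so both move toward "less likely". Because the
    penalty multiplies/divides every seen token's logit on every step,
    even a shift from 1.02 to 1.05 compounds noticeably.
    """
    out = list(logits)
    for t in set(seen_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [3.0, 1.0, -0.5]  # toy logits for token ids 0, 1, 2
seen = [0, 2]              # token ids already generated
mild = apply_repeat_penalty(logits, seen, 1.02)
strong = apply_repeat_penalty(logits, seen, 1.08)
# Token 1 was never seen, so its logit is untouched at either setting.
```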

To really push the model:

Rep pen 1.05 or lower / temp 3+ ... be ready to stop the output, because it may go and go at these strong settings.

You can also set a "hard stop" (a maximum number of generated tokens) to rein in low rep pen / high creativity settings.

Longer prompts vastly increase the quality of the model's output.

QUANT CHOICE(S):

Higher quants will have more detail and nuance, and in some cases stronger "emotional" levels. Characters will also be more "fleshed out", and the sense of "there" will also increase.

Q4KM/Q4KS are good, strong quants; however, if you can run Q5, Q6 or Q8, go for the highest quant you can.

This repo also has 3 "ARM" quants for computers that support them. If you use these on a non-ARM machine, tokens per second will be very low.

IQ4XS: Due to the unusual nature of this quant (mixture/processing), generations from it will differ from those of other quants.

You may want to try it and compare its output to that of other quants.

Special note on Q2k/Q3 quants:

You may need to use temp 2 or lower with these quants (1 or lower for Q2k); there is just too much compression at this level, which damages the model. I will see if Imatrix versions of these quants function better.

Rep pen adjustments may also be required to get the most out of this model at these quant levels.

<B>Models Used:</b>
This is a high precision "DARE TIES" merge at the layer level (each layer per model adjusted - 168 points of adjustment over the 4 models) comprised of these models: