Text Generation
GGUF
English
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
horror
gemma
mergekit
Inference Endpoints
conversational
<h3>Gemma-The-Writer-DEADLINE-10B-GGUF</h3>

<img src="gemma-deadline.jpg" style="float:right; width:300px; height:300px; padding:10px;">

This is a Gemma2 model merge of the top storytelling / writing models as noted at EQBench, tuned specifically for fiction, story, and writing.

This model requires GEMMA Instruct template, and has 8k context window but is ex
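Since the model requires the Gemma Instruct template, here is a minimal plain-Python sketch of that prompt format. The turn markers below are the standard Gemma/Gemma2 chat-template tokens; this is an illustration only - verify against your front-end's own chat template before relying on it:

```python
def gemma_instruct_prompt(user_message: str) -> str:
    """Wrap a single user message in the standard Gemma Instruct turn format.

    The model's reply is generated after the final '<start_of_turn>model' marker.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example: build a storytelling prompt for this model.
prompt = gemma_instruct_prompt("Write a 1000 word scene about a storm at sea.")
```

Most front-ends (e.g. those reading the GGUF's embedded chat template) apply this wrapping automatically, so manual formatting is usually only needed for raw completion APIs.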

This version - "Deadline" - is a modified version of "Gemma The Writer 9B" ( https://huggingface.co/DavidAU/Gemma-The-Writer-9B-GGUF ) that has been augmented with a Brainstorm 5x adapter to alter output generation.

This adds close to 1B parameters to the model, raising it to 46 layers and 508 tensors, for a total of 10B parameters.

The addition of Brainstorm has altered the prose and sentence structure, reduced GPT-isms, and generally improved the model's performance.

It also raises the average output length - in some cases almost doubling it.

Recommended rep pen of 1.02 or higher; temp range 0-5. (See other settings notes below.)

Example outputs below.

<B>Settings, Quants and Critical Operations Notes:</b>

This model has been modified ("Brainstorm") to alter prose output, and it generally outputs longer text than average.

A change in temp (e.g., .4, .8, 1.5, 2, 3) will drastically alter the output.
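As a rough illustration of why temp changes output so drastically, here is a minimal plain-Python sketch of temperature-scaled softmax over toy logits - not tied to this model or any particular inference engine:

```python
import math

def softmax_with_temperature(logits, temp):
    """Divide logits by temp, then softmax.

    Low temp sharpens the distribution (the top token dominates);
    high temp flattens it (more tokens become plausible picks).
    """
    scaled = [l / temp for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # toy next-token logits
cool = softmax_with_temperature(logits, 0.4)
hot = softmax_with_temperature(logits, 2.0)
# At temp 0.4 the top token takes almost all the probability mass;
# at temp 2.0 the distribution flattens, which is why temp 2-5
# produces far more varied (and riskier) prose.
```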

Rep pen settings will also alter the output.

This model needs a rep pen of 1.02 or higher.

For role play, a rep pen of 1.05 to 1.08 is suggested.

Raise/lower rep pen SLOWLY, i.e. 1.011, 1.012 ...

Rep pen will alter prose, word choice (a lower rep pen can sometimes favor smaller/simpler words) and creativity.
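For intuition on why such tiny rep pen steps (1.011, 1.012 ...) matter, here is a minimal sketch of the classic CTRL-style repetition penalty used by llama.cpp-style samplers - toy numbers for illustration, not this model's exact implementation:

```python
def apply_repeat_penalty(logits, seen_tokens, penalty):
    """Shrink logits of tokens that already appeared in the context.

    Positive logits are divided by the penalty and negative ones
    multiplied by it, so both move toward "less likely". Because the
    penalty multiplies/divides every seen token's logit on every step,
    even a shift from 1.02 to 1.05 compounds noticeably.
    """
    out = list(logits)
    for t in set(seen_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out

logits = [3.0, 1.0, -0.5]  # toy logits for token ids 0, 1, 2
seen = [0, 2]              # token ids already generated
mild = apply_repeat_penalty(logits, seen, 1.02)
strong = apply_repeat_penalty(logits, seen, 1.08)
# Token 1 was never seen, so its logit is untouched at either setting.
```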

To really push the model:

Rep pen 1.05 or lower / temp 3+ ... be ready to stop the output, because it may go and go at these strong settings.

You can also set a "hard stop" (a maximum number of generated tokens) to rein in low rep pen / high creativity settings.

Longer prompts vastly increase the quality of the model's output.

QUANT CHOICE(S):

Higher quants will have more detail and nuance, and in some cases stronger "emotional" levels. Characters will also be more "fleshed out", and the sense of "there" will also increase.

Q4KM/Q4KS are good, strong quants; however, if you can run Q5, Q6 or Q8, go for the highest quant you can.

This repo also has 3 "ARM" quants for computers that support them. If you use these on a non-ARM machine, tokens per second will be very low.

IQ4XS: Due to the unusual nature of this quant (mixture/processing), generations from it will differ from those of other quants.

You may want to try it and compare its output to that of other quants.

Special note on Q2k/Q3 quants:

You may need to use temp 2 or lower with these quants (1 or lower for Q2k); there is just too much compression at this level, which damages the model. I will see if Imatrix versions of these quants function better.

Rep pen adjustments may also be required to get the most out of this model at these quant levels.

<B>Models Used:</b>
This is a high precision "DARE TIES" merge at the layer level (each layer per model adjusted - 168 points of adjustment over the 4 models) comprised of these models: