Update README.md

Chronos Gold 12B 1.0 is a unique model that applies to domain areas such as general chatbot functionality, *roleplay*, and storywriting. The model has been observed to write up to 2250 tokens in a single sequence. The model was trained at a sequence length of 16384 (16k) and still retains the *apparent* 128k context length from Mistral-Nemo, though it deteriorates at longer contexts like regular Nemo does, based on the [RULER Test](https://github.com/hsiehjackson/RULER?tab=readme-ov-file#-ruler-whats-the-real-context-size-of-your-long-context-language-models).

As a result, it is recommended to keep your maximum sequence length at 16384, or you will experience performance degradation.
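
For illustration, here is a minimal sketch of enforcing that 16k cap with `transformers`. The repo id, prompt, and sampling settings are assumptions for the example, not part of this model card:

```python
# A minimal sketch, assuming the Hugging Face repo id below and an
# available GPU; adjust MODEL_ID to the actual repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "elinas/Chronos-Gold-12B-1.0"  # assumed repo id
MAX_CONTEXT = 16384   # cap at the trained sequence length, per the note above
MAX_NEW_TOKENS = 2250  # longest single sequence observed from the model

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Write the opening scene of a mystery novel."
# Truncate the prompt so prompt + generation stays within the 16k window.
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=MAX_CONTEXT - MAX_NEW_TOKENS,
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=MAX_NEW_TOKENS,
    do_sample=True,
    temperature=0.8,
)
# Print only the newly generated tokens.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
))
```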

The base model is `mistralai/Mistral-Nemo-Base-2407`, which was heavily modified to produce a more coherent model, comparable to much larger models.