Update README.md
README.md
CHANGED
@@ -15,13 +15,14 @@ library_name: transformers
 ![image/webp](https://cdn-uploads.huggingface.co/production/uploads/630417380907b9a115c6aa9f/3hc8zt8fzKdO3qHK1p1mW.webp)
 
 Chronos Gold 12B 1.0 is a unique model that applies to domains such as
-general chatbot functionality, *roleplay*, and storywriting. The model has been observed to write up to 2250 tokens in a single sequence.
+general chatbot functionality, *roleplay*, and storywriting. The model has been observed to write up to 2250 tokens in a single sequence. The model was trained at a
+sequence length of 16384 (16k) and still retains the *apparent* 128k context length from Mistral-Nemo.
 
 The base model is `mistralai/Mistral-Nemo-Base-2407`, which was heavily modified to produce a more coherent model, comparable to much larger models.
 
 **Chronos Gold 12B-1.0** re-creates the uniqueness of the original Chronos with significantly enhanced prompt adherence (instruction following), coherence, a modern dataset, and support for a majority of "character card" formats in applications like SillyTavern.
 
-It went through an
+It went through an iterative and objective merge process, as with my previous models, and was further finetuned on a dataset curated for it.
 
 The specifics of the model will not be disclosed at this time due to dataset ownership.
 
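For reference, a minimal loading sketch with `transformers` (the card's stated `library_name`). The diff does not name the Hub repository, so the repo id below is an assumption for illustration; substitute the actual one.

```python
# Minimal usage sketch. The repo id is hypothetical (not stated in this diff);
# replace it with the model's actual Hub repository name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "elinas/Chronos-Gold-12B-1.0"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 12B parameters; bf16 roughly halves memory vs fp32
    device_map="auto",
)

# The card notes single responses of up to ~2250 tokens, so leave generous headroom.
prompt = "Write the opening scene of a slow-burn mystery set in a lighthouse."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Since training used a 16384-token sequence length, keeping prompt plus response within roughly that window is a reasonable default, even though the Mistral-Nemo base advertises a 128k context.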