---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- not-for-all-audiences
---

# ShoriRP 🏆
A LIMA-like (fewer than 1,000 training samples) roleplaying chat model based on data from:

- Two subject-specific RP forums;
- Synthetically-crafted conversations from [Limamono](https://huggingface.co/lemonilia/Limamono-Mistral-7B-v0.50);
- Some background lore and character descriptions (thus far mainly pertaining to Limamono);
- A tiny amount of RP-like instructions/alignment data.

An important difference from LimaRP, other than the subject focus, is that conversations are multi-character
where applicable, whereas LimaRP only included 1-on-1 RP. Furthermore, the sampled messages are generally
shorter. The rationale is that short(er)-form roleplays are more "fun" on average, while
the longer ones tend to use common purple prose tropes and be a bit dull.

**This is still a work in progress. Updates will be posted in the future.**

---

# Technical details
- The prose of the training data has a consistent novel-like format with narration in third person and past tense.
- OOC (out-of-character) talk was intentionally _not_ eliminated completely; instead, it was isolated into a special role. Likewise, URLs were only deleted when they referred to internal forum resources.
- In a very small portion of the data, suitable emoji (mostly 1, up to 3) conveying the mood have been _prepended_ to dialogue lines and thoughts. _Prepending_ instead of _appending_ helps the model and the reader prepare for the tone of the message.
- Usernames have been entirely removed; only character names remained in the data (same policy as with LimaRP).

# Known issues
- The model is very horny, but this can be toned down with an appropriate system instruction.
- There are some repetition issues. This could be due to the base model used.
- Occasionally at the beginning of the chat (first message) there might be impersonation issues.
- There might be some residual "alignment" from the base model.

# Suggested starting text generation settings
- **Main choice** (may have repetition issues)
   - **Temperature**: 1.0; **Min-P**: 0.05-0.10; **Presence Penalty**: 0.35-0.45 
- **Alternative** (appears to solve repetition issues while staying coherent, but responses might be less truthful)
   - **Temperature**: 2.40-2.50; **Min-P**: 0.40; **Frequency penalty**: 0.10-0.15; Temperature last.
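
To illustrate why the alternative preset can stay coherent at such a high temperature, here is a minimal pure-Python sketch of Min-P filtering with and without "temperature last". It assumes the common definition of Min-P (keep tokens whose probability is at least `min_p` times that of the most likely token); the function names are hypothetical, and presence/frequency penalties are left out.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_keep(logits, temperature, min_p, temperature_last):
    """Return the indices of tokens that survive Min-P filtering."""
    # With "temperature last", the filter sees the unscaled distribution.
    scaled = logits if temperature_last else [x / temperature for x in logits]
    probs = softmax(scaled)
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]


def sample(logits, temperature=1.0, min_p=0.05, temperature_last=False, rng=random):
    """Draw one token index after Min-P filtering and temperature scaling."""
    kept = min_p_keep(logits, temperature, min_p, temperature_last)
    # Temperature shapes the final draw among the surviving tokens only.
    weights = softmax([logits[i] / temperature for i in kept])
    return rng.choices(kept, weights=weights, k=1)[0]


logits = [5.0, 4.0, 0.0]  # toy values, not taken from the model
print(min_p_keep(logits, 2.4, 0.4, temperature_last=True))   # [0]
print(min_p_keep(logits, 2.4, 0.4, temperature_last=False))  # [0, 1]
```

In this toy case, filtering before the temperature is applied removes a candidate that the flattened distribution would otherwise let through.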

# Prose format
All training samples use book (novel) format with narration in third person / past tense. Other formats are not supported (they might work, but not consistently).

## Details
- Character thoughts are delimited with underscores `_`.
- Onomatopoeias are delimited with single asterisks `*`.
- Emphasized text is delimited by double asterisks `**`.
- Spoken dialogues are delimited with ASCII quote marks `"`.
- Non-dialogue quotes are replaced with double apostrophes `''`. This avoids distracting and/or annoying conflicts with the dialogue highlighting in SillyTavern.
- Text to be interpreted as louder than normal is in `ALL CAPS`.
- Quoted text from other people is most of the time prepended with `>`.
- Formatted output text is delimited with triple backticks ` ``` `, sometimes followed by additional identifiers specifying the language (markdown, text, etc).

# Prompting format
Suitable `json` files have been provided to easily apply the prompting format in SillyTavern.

- [Context](https://huggingface.co/lemonilia/ShoriRP-v0.68/resolve/main/BlockML-Context.json?download=true)
- [Instruct](https://huggingface.co/lemonilia/ShoriRP-v0.68/resolve/main/BlockML-Instruct.json?download=true)

Note: the prompting format is **intentionally different** from that of the Mistral-Instruct base model.

It is advised to use `▄` as a stop token.
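
If a front-end or backend does not support custom stop strings, the output can be trimmed after generation instead. The following is a minimal sketch; `trim_at_stop` is a hypothetical helper, not part of any library.

```python
STOP_TOKEN = "▄"  # lower half block, closes every block in this format


def trim_at_stop(generated: str, stop: str = STOP_TOKEN) -> str:
    """Return the text before the first stop token, without trailing whitespace."""
    return generated.split(stop, 1)[0].rstrip()


raw = 'Reimu: "Hello."▄\n▀message\nMarisa:'
print(trim_at_stop(raw))  # Reimu: "Hello."
```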

## Reverse jailbreak
Since the model is normally very willing to initiate NSFW scenarios even when inappropriate, a "reverse jailbreak"
has been added in the Instruct preset linked above:

```
[INST] Write a safe conversation suitable for all audiences. Don't be vulgar or sexually explicit. [/INST]
```

Placed as a system instruction, this has only the effect of _toning down_ the model's default horniness and won't actually prevent
NSFW content. If desired, it can be removed.

## Block characters
The model uses a _ChatML-like_ prompting format with a few changes from the usual roles typically used for ChatGPT-like assistant chatbots. The main one is that `<|im_start|>` has been replaced with `▀` (upper half block character) and `<|im_end|>` has been replaced with `▄` (lower half block character).

Both of these tokens already exist in the Mistral tokenizer as single tokens; they don't have any combination with other tokens, nor any special meaning attached to them, so for all intents and purposes they work like special tokens.

This avoids the complications of training a model with new tokens, as well as the tokenization issues that occur with ChatML tokens when they are used literally.
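
As a rough illustration of the mapping described in this section, the sketch below converts ChatML-style text to block characters and wraps a single role block. The helper names are hypothetical, and the newline after the role name follows the layout of the examples in this card.

```python
BLOCK_START = "▀"  # upper half block, replaces <|im_start|>
BLOCK_END = "▄"    # lower half block, replaces <|im_end|>


def chatml_to_block(text: str) -> str:
    """Swap literal ChatML delimiters for the block characters."""
    return text.replace("<|im_start|>", BLOCK_START).replace("<|im_end|>", BLOCK_END)


def block(role: str, content: str) -> str:
    """Wrap content in one role block, with the role on its own line."""
    return f"{BLOCK_START}{role}\n{content}{BLOCK_END}\n"


print(chatml_to_block("<|im_start|>title\nA strange incident in Gensokyo<|im_end|>"))
print(block("tags", "barrier, danmaku, magic"), end="")
```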

## Roles
All roles except `message` are optional.

Role       | Description
-----------|------------
title      | The title of the roleplay. It's used for steering the conversation at the beginning. Generally it's the first block in the RP conversations, but it can occur mid-conversation when the scene changes.
tags       | A list of comma-separated relevant tags to hint the model about chat contents. If added, it should be placed after the title.
lore       | Extended background or character lore/story is to be placed under the `lore` role.
scenario   | Future events that must still happen go in `scenario`. This is also used for steering the contents of the conversation at the beginning.
description| This is where character cards go. No specific layout for character profiles is defined, but the name of the character should be clear from the description. In the training data, profiles may occasionally appear mid-conversation (for example when a new character appears). Try to use one `description` block per character.
message    | [**Mandatory**] Messages are all under the `message` role regardless of who writes them. The rationale for this is that since conversations are multi-character and the characters do not necessarily reply in a fixed order, it won't be possible to reliably establish who is the "human" in terms of training. `message` was found to be neutral enough as a role and a better fit, considering the length hints that can be added.
ooc        | A dedicated communication channel where OOC talk has been confined, but it's unclear how this could actually be used in existing LLM front-ends.

### Message length hints
Like LimaRP, messages use optional **length hints**. It's recommended to add them, otherwise the model may output very short messages. _It is however still possible to use the model without them for a more dynamic and fast roleplaying experience._

The available lengths are: `nano`, `micro`, `tiny`, `short`, `medium`, `long`, `massive`, `huge`, `enormous`. The recommended length is _medium_. The longest sizes do not have a large amount of training data, so they might not work very reliably. Refer to the prompting examples below for how to add length hints.
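
A small helper can guard against typos in the hint, since an unknown length would be outside the training distribution. This is only a sketch; `message_header` is a hypothetical name, and the header layout is copied from the examples in this card.

```python
LENGTHS = (
    "nano", "micro", "tiny", "short", "medium",
    "long", "massive", "huge", "enormous",
)


def message_header(length=None):
    """Return the opening line of a message block, with an optional length hint."""
    if length is None:
        return "▀message"  # hints are optional
    if length not in LENGTHS:
        raise ValueError(f"unknown length hint: {length!r}")
    return f"▀message (length: {length})"


print(message_header("medium"))  # ▀message (length: medium)
print(message_header())          # ▀message
```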

## Example prompt template
```text
▀title
{story title}▄
▀tags
{comma-separated list of tags}▄
▀lore
{{loreBefore}}▄
▀description
{{char}}
{{description}}▄
▀description
{{user}}
{{persona}}▄
▀scenario
{{scenario}}▄
▀message (length: {length})
{{char}}: {message}▄
▀message (length: {length})
{{user}}: {message}▄
▀message (length: {length})
{{char}}: {message}▄

[...]
```
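
For use outside SillyTavern, the template above could be assembled programmatically. The sketch below is one possible way to do it under a few assumptions: the function and parameter names are hypothetical, only some of the optional roles are covered, and the prompt ends with an open `message` block so that the model continues as `next_speaker`.

```python
def block(role, content):
    """Wrap content in one role block delimited by the block characters."""
    return f"▀{role}\n{content}▄\n"


def build_prompt(title, tags, descriptions, messages, next_speaker, length="medium"):
    """Assemble a prompt; messages is a list of (speaker, text, length) tuples."""
    parts = [block("title", title), block("tags", ", ".join(tags))]
    # One description block per character, as recommended in the roles table.
    parts += [block("description", text) for text in descriptions]
    for speaker, text, msg_length in messages:
        parts.append(block(f"message (length: {msg_length})", f"{speaker}: {text}"))
    # Leave the last block open so generation starts right after the name.
    parts.append(f"▀message (length: {length})\n{next_speaker}:")
    return "".join(parts)


print(build_prompt(
    title="A strange incident in Gensokyo",
    tags=["barrier", "magic"],
    descriptions=["**Name:** Reimu Hakurei", "**Name:** Marisa Kirisame"],
    messages=[("Reimu", '"Hmm... I wonder what\'s going on?"', "short")],
    next_speaker="Marisa",
))
```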

## Practical example
```
▀title
A strange incident in Gensokyo▄
▀tags
barrier, danmaku, magic, reimu, marisa▄
▀description
**Name:** Reimu Hakurei
**Age:** 18
**Personality:** Calm and collected. She is a very responsible person and tries to do her job as well as she can. She also likes to take care of people around her, even if they are not always nice to her.
**Appearance:** Reimu is a young girl with long, black hair and brown eyes. She wears a red ribbon and matching tubes on her sidelocks and a traditional shrine maiden uniform, with a red hakama over a white kimono.
**Background:** Reimu is the shrine maiden of Hakurei Shrine, located in the center of Gensokyo. She spends most of her time taking care of the shrine and performing various duties for the residents of the village. She is known to be quite skilled in the use of magic, especially when it comes to barrier magic.▄
▀description
**Name:** Marisa Kirisame
**Personality:** Impulsive and energetic. She is often seen as a troublemaker by others due to her tendency to break rules and cause chaos wherever she goes. She is also a bit of a flirt and enjoys teasing others.▄
▀message (length: medium)
Reimu: "Hmm... I wonder what's going on?" Reimu mused as she stood at the entrance to the shrine, looking out at the village beyond. It was unusually quiet today, with no one coming to visit or offer any kind of offering. She had been expecting a few visitors this morning, but none had shown up yet.

"Maybe everyone is busy with something else today? Or maybe they're all sick?" she thought as she turned back inside, closing the door behind her. She began tidying up the shrine, making sure everything was clean and ready for visitors. As she worked, she couldn't shake the feeling that something wasn't right.▄
▀message (length: short)
Marisa: "Ooohh! Reimu-chan~!" Marisa suddenly appeared from nowhere, landing on the ground with a soft thud. "What's wrong? Why aren't there any customers today? Aren't you supposed to have lots of visitors every day? I thought you were famous for being able to heal injuries and cure diseases..."

She gave her friend a wink before continuing, "But I guess I could always come by and give you some company! I'm bored anyway~"▄
▀message (length: long)
Reimu: _Ugh, that girl again..._ Reimu thought as she looked at Marisa with annoyance. The younger girl was known for causing mischief wherever she went, and Reimu didn't appreciate her interrupting her work.

"I don't know, Marisa," she replied curtly. "No one seems to be coming today. Maybe they're all busy with their own things. But thank you for offering your help."

Reimu continued cleaning the shrine while keeping an eye on Marisa. She knew that if she left the girl alone for too long, she would probably start causing trouble. She just hoped that nothing bad happened today.▄
```

## Mixing Mistral-Instruct and ShoriRP prompt formats together
The instruction prompting format of the base model, Mistral-Instruct, can also be used together with that of
ShoriRP, with very good results in chat steerability.

An `[INST] ... [/INST]` block can be either used as a "system instruction" on the top of the conversation, or
inserted between one message block and the other as if it was an "author note", as seen in this example (chat history
and contents omitted for brevity):

```
▀message
Chen: [...]▄
[INST] Yukari's personality: proud, haughty [/INST]
▀message
Yukari: [...]▄
```
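
The "author note" placement shown above can be sketched as a small helper that inserts the instruction just before the final block. `with_author_note` is a hypothetical name, and joining blocks with a newline is an assumption based on the example.

```python
def with_author_note(blocks: list[str], note: str) -> str:
    """Join message blocks, inserting an [INST] note before the last one."""
    if not blocks:
        raise ValueError("at least one block is required")
    *head, last = blocks
    return "\n".join([*head, f"[INST] {note} [/INST]", last])


print(with_author_note(
    ["▀message\nChen: [...]▄", "▀message\nYukari: [...]▄"],
    "Yukari's personality: proud, haughty",
))
```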

# Dataset
Similar to LimaRP, but more niche. Flexible training sample length (from 4k to 32k tokens, at least). Might or might not be released in the future.

The model is trained in several consecutive steps with decreasing learning rate and increasing data
quality/focus. While it is unknown whether having separate low- and
mid-tier categories helps, the higher tiers are needed for the model to focus mainly on the prose and
format of the higher-quality data. This also makes retraining quicker if it only involves changes in that data.

In general, training on higher-quality data last increases its weight in the outputs.

| Category | Description
|:--:|---
|Low | Short or very short-form RP conversations (often composed of one-liners); prose quality not always good.
|Mid | Mid-range and longer-form RP conversations that do not always meet the required quality standards or target prose format + Some lore data and character descriptions.
|High| Longer-form RP conversations of target prose quality.
|Top | Synthetic data from Limamono + Some alignment and RP-like instruction data.
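
The staged schedule could be expressed as a simple plan like the one below. This is a sketch only: the learning rates are the four values listed in the training hyperparameters of this card, but pairing each rate with one data category is an assumption, and the `order` and `tier` keys are hypothetical rather than options of any training framework.

```python
# Learning rates copied from the hyperparameters in this card; pairing each one
# with a data tier is an assumption made for this sketch.
STAGES = [
    ("low", 0.0000725),
    ("mid", 0.0000550),
    ("high", 0.0000375),
    ("top", 0.0000350),
]


def training_plan(stages=STAGES):
    """Return one settings dict per stage, checking that the rate decreases each time."""
    plan = []
    previous_rate = float("inf")
    for order, (tier, rate) in enumerate(stages, start=1):
        if rate >= previous_rate:
            raise ValueError(f"learning rate must decrease at stage {tier!r}")
        plan.append({"order": order, "tier": tier, "learning_rate": rate})
        previous_rate = rate
    return plan


for stage in training_plan():
    print(stage)
```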

## Stats
From my data building script:

```text
Total conversations: 461
User message count: 29,788 messages
Total unique tokens: 4,473,615 tokens
Longest conversation: 16,372 tokens
```

- Size of the training data: 17.2 MB (about 40% larger than the first LimaRP release)
- The user message count doesn't include descriptions and other metadata.
- The actual number of conversations is higher than what the above figure suggests, since many are split into several sub-conversations.

### Message length distribution
Most user messages are below 300 tokens in length.

![Message length distribution](https://files.catbox.moe/yxdgop.png)

# Training details
## Hardware
1x NVIDIA RTX 3090 24GB

## Software
[Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

## Training hyperparameters
```yaml
base_model: /home/anon/AI-Models/LLM/Mistral-7B-Instruct-v0.2
load_in_4bit: true
adapter: qlora
sequence_len: 16384
sample_packing: true
pad_to_sequence_len: false
gradient_accumulation_steps: 2
micro_batch_size: 1
eval_batch_size: 1
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: constant
learning_rate: 0.0000725 -> 0.0000550 -> 0.0000375 -> 0.0000350
weight_decay: 0.05
train_on_inputs: true
bf16: true
fp16: false
tf32: true
lora_r: 20
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true
```
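
A few derived quantities follow from these values by simple arithmetic. The sketch below assumes a single GPU (as stated under Hardware) and the standard LoRA scaling factor `lora_alpha / lora_r`; the variable names are illustrative.

```python
config = {
    "sequence_len": 16384,
    "gradient_accumulation_steps": 2,
    "micro_batch_size": 1,
    "lora_r": 20,
    "lora_alpha": 16,
}

# Packed sequences consumed per optimizer step on one GPU.
sequences_per_step = config["micro_batch_size"] * config["gradient_accumulation_steps"]

# Upper bound on tokens per optimizer step, reached when every packed sequence is full.
max_tokens_per_step = sequences_per_step * config["sequence_len"]

# Standard LoRA scaling applied to the adapter update (assumes no rank-stabilized variant).
lora_scale = config["lora_alpha"] / config["lora_r"]

print(sequences_per_step)   # 2
print(max_tokens_per_step)  # 32768
print(lora_scale)           # 0.8
```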