Context Preset
What context preset should be used with this model? Does it have a custom one? I tried the Llama 3 context preset and the model goes crazy. Following the information in the readme just makes it produce gibberish.
Hi, I'm not quite sure what you mean by the context preset. If you're referring to the prompt format, then this model does use a custom prompt format that is different from the Llama 3 Instruct format. Generally speaking, the prompt format in the readme should work; you can also use the tokenizer.apply_chat_template method, since the chat template is also set in the "tokenizer_config.json" file.
If you're getting nonsensical outputs, it could be an issue with the temperature setting. Are you using a temperature of 1 or some other value? I recommend lowering the temperature to something like 0.7 for this model, since it can sometimes generate nonsensical outputs at higher temperatures.
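In case it helps, here's a minimal sketch of loading the model with Hugging Face transformers, applying the chat template from "tokenizer_config.json", and generating at the lower temperature mentioned above. The model ID is just a placeholder for this repo.

```python
# Minimal sketch: apply the chat template from tokenizer_config.json and generate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/this-model"  # placeholder; replace with this repo's model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# apply_chat_template uses the chat_template defined in tokenizer_config.json,
# so the model's custom prompt format is applied for you.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,  # lower temperature, as suggested above
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```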
I'm using the prompt format from the readme, with a temperature of 0.7 in SillyTavern. The bot frequently talks in nonsensical ways, completely forgets who it is, or goes off on wild tangents unrelated to anything said. Because the custom prompt doesn't include anything about the bot's description or scenario, it's also unable to roleplay effectively, and SillyTavern pops up warnings saying "prompt format missing scenario, prompt format missing persona, prompt format missing description".
I have temp 0.7, top-k 0, top-p 1, typical-p 1, min-p 0.02, top-a 0, TFS 1, rep pen 1.05, rep pen range 512, rep pen slope 0.7, and everything else at 0, except DRY at 0.8, 1.75, and 2. Those are the settings I found for Llama 3.1, and they can sometimes get it to give a couple of coherent responses.
It will write a long creative story if asked, but it seems extremely difficult to make it roleplay anything.
Hmm, something is definitely wrong then. If you are using SillyTavern, the easiest way to use the model with the correct settings is the Chat Completion API with a Custom (OpenAI-compatible) endpoint. You can then either connect to the Aion Labs API or host the model locally with Ollama, for example. The correct prompt format and other settings will be applied automatically, so nothing more needs to be done.

I just tested this method in SillyTavern (connecting to the Aion Labs API, which is free) and the model worked properly. I'm not that familiar with SillyTavern, so I'm not sure how you were setting up the model before, but let me know if you still have issues getting it to work with the method above.
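For reference, outside of SillyTavern the same Chat Completion route looks roughly like the sketch below, here pointed at Ollama's local OpenAI-compatible endpoint. The model name is a placeholder for whatever name you pulled or created the model under; the Aion Labs API works the same way with its own base URL and API key.

```python
# Rough sketch of the OpenAI-compatible Chat Completion route against a local Ollama server.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # Ollama ignores the key, but the client requires one
)

response = client.chat.completions.create(
    model="this-model",  # placeholder; use the name you gave the model in Ollama
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)
```

SillyTavern's Custom (OpenAI-compatible) endpoint just needs that same base URL and model name, and it will handle the prompt formatting for you.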