nbeerbower/Mistral-Nemo-Prism-12B-v7 · So... is this the final one?

Nov 13, 2024

Hi~ Prodeus Unity here! Don't mean to be a bother, I'm just checking whether this is the official last version, and which of the versions is the best out of v1 - v7. I made a merge earlier and want to make sure it's built on the highest-quality model.

nbeerbower

Owner Nov 13, 2024

Hey! Sorry for the chaos... these ended up being experimental models and IMO they all failed the main goal (I want these damn models to stop saying "ministrations"). So I can't really say which one is the "best". But yeah, this is the last one for QLoRA tunes. I will try again with a full finetune but I might name the model something else.

ProdeusUnity

Nov 13, 2024

Ah damn, that sucks. I used v5 in the merge I did before, the one with Mag Mell. The "ministrations" issue seems relatively difficult to stop in general, though I hear that training the model in the same domain where it happens can greatly help reduce it. I'm no expert though. v5 seems to be the middle spot for said merge too.

DazzlingXeno

Nov 13, 2024

I tried V2 and thought it was nice, especially for a 12b model. Definitely some potential there with those datasets!

nbeerbower

Owner Nov 13, 2024

Appreciate the feedback!

Markobes

8 days ago

I tried it for translating subtitles from English into another language. The result was impressive. However, the model suddenly froze and continued only after an additional request. By the way, as I understand it, the training was conducted on a corpus of English-language texts. It probably wouldn't hurt to add texts in other languages.
Unbabel/TowerInstruct-Mistral-7B came out rather clumsy. I would recommend an experiment merging their TowerInstruct-13B model with Mistral-Nemo to get something that could resemble a literary translation. Another option is to merge with Aya. However, I got non-existent words from it, which calls for caution. Everything went much better with Command-R.
Thanks.