General discussion and feedback.
Feel free to share or ask anything.
Based on your experience and expertise, are you familiar with any models, from your own work or from other members, that can effectively process an article of approximately 3000 words while accurately following prompts and retaining all the intricate details of the content?
Heya, cheers!
experience and expertise
Not really, you flatter me, but I'm just a horny chat roleplayer and gamer.
Given the size of your input, I'd suggest testing Kunocchini-7B-128k-test (v2 quant files here), though I've only ever used it for roleplay.
Maybe @Test157t knows of something, but I can't say that's a use case I've ever had to deal with.
Most of the Mistral 7Bs should handle RAG-style vectorization through ST fine at 3000 words.
@Test157t, could you explain a little more about RAG-style vectorization?
And among the mistral based models, what's your specific suggestion for this purpose?
For the model, this looks solid:
NousResearch/Nous-Hermes-2-Mistral-7B-DPO
It's Mistral-based and trained at 32K context.
There are a few GGUF quants for it you can already try.
@Test157t - Looking spicy:
l3utterfly/mistral-7b-v0.1-layla-v4
Vectorize file inputs with ST, then use RAG (retrieval-augmented generation) to pull information that matches queries relating to the vectorized content. [This is built into ST] but can be done in other ways as well.
Specific models depend on what the use case is.
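For anyone unfamiliar with the flow being described: a rough sketch of the idea only, with a toy term-frequency "embedding" standing in for whatever vector model ST actually plugs in (the function names and chunking here are made up for illustration). Split the article into chunks, vectorize each chunk, then retrieve the chunk most similar to the query before handing it to the model.

```python
import math
import re
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, k=1):
    # Rank stored chunks by similarity to the query; return the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)[:k]

chunks = [
    "Mistral 7B supports a 32K context window.",
    "RAG retrieves relevant passages before generation.",
    "Llamas are domesticated South American camelids.",
]
print(retrieve(chunks, "how does retrieval augmented generation work?"))
```

In practice you'd swap the toy `embed` for a real embedding model and prepend the retrieved chunks to the prompt, which is roughly what ST's built-in vectorization automates.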
@Lewdiculous looks interesting, will consider using it in a side merge. However, I primarily use models centered around ChatML, as that's the format I typically use.
Oh yeah, just found it interesting, at least for experiments. Your Prima Swag scores are pretty wild.
Yeah, it's getting to the point where further improvements are going to be much more difficult to pull off, but alas, I'm gonna keep smashing rocks. Also, Lelantacles should work great for RAG. Haven't tested v6 with it yet, but v5 was very good considering it's not trained to any capacity to use it.