---
base_model:
- RozGrov/NemoDori-v0.2-12B-MN-BT
- crestf411/nemo-sunfall-v0.6.1
- unsloth/Mistral-Nemo-Instruct-2407
- UsernameJustAnother/Nemo-12B-Marlin-v5
library_name: transformers
tags:
- mergekit
- merge
---
|
# NemoDori-v0.2.2-12B-MN-ties |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
This is the second child of [NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT), and a sibling to [**v0.2.1**](https://huggingface.co/RozGrov/NemoDori-v0.2.1-12B-MN-BT).
|
|
|
**The purpose** of this merge is to improve v0.2's ability to stay **aware of past conversation turns** and **follow instructions better**, especially the most recent instruction (depth-0),

while keeping its **creativity and capability to (E)RP**.

This model is one of several children created in pursuit of that goal.
|
|
|
In my short testing so far, I think it's **slightly more aware of what's in the past and what it's instructed to do**, but the **response format is not very consistent**. That said, this might be due to my testing temperature or poorly written instructions.
|
<br> |
|
You can push the temperature up to 2 with this boy (just like its predecessor [v0.1](https://huggingface.co/RozGrov/NemoDori-v0.1-12B-MS)), but it **will not** satisfy you: it starts spewing old-English phrasing in a modern style.
|
<br> |
|
Anyway, tweak the preset all you want in ST (the harmless settings, at least); it should still respond correctly. If it doesn't, use the preset from [v0.1](https://huggingface.co/RozGrov/NemoDori-v0.1-12B-MS).
|
|
|
Feedback on anything is welcome, as is guidance on how I can fulfill my-*ahem* its purpose while keeping the model well below 70B.
|
<br> |
|
Fine-tuning is... pretty expensive for me, and I'm not ready for that (yet, though I'm interested).
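Since the card declares `library_name: transformers`, the model can be loaded the usual way. A minimal sketch (the prompt and generation settings are illustrative, not a recommendation; adjust `device_map` and dtype to your hardware):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RozGrov/NemoDori-v0.2.2-12B-MN-ties"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

# Build a prompt with the tokenizer's chat template (Mistral-Nemo instruct format)
messages = [{"role": "user", "content": "Hello! Who are you?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```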
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as a base. |
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [crestf411/nemo-sunfall-v0.6.1](https://huggingface.co/crestf411/nemo-sunfall-v0.6.1) |
|
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407) |
|
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
models:
  - model: crestf411/nemo-sunfall-v0.6.1
    parameters:
      weight: 0.22
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.22
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.22
  - model: RozGrov/NemoDori-v0.2-12B-MN-BT
    parameters:
      weight: 0.9
merge_method: ties
base_model: RozGrov/NemoDori-v0.2-12B-MN-BT
parameters:
  density: 0.93
dtype: bfloat16
```
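To reproduce the merge, a config like the one above can be fed to mergekit's CLI. A minimal sketch, assuming mergekit is installed and the YAML is saved locally as `config.yaml` (the output directory name is arbitrary):

```shell
pip install mergekit
# --cuda is optional; without it the merge runs on CPU (slowly)
mergekit-yaml config.yaml ./NemoDori-v0.2.2-12B-MN-ties --cuda
```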
|
|