georgesung/llama3_8b_chat_uncensored

	---
	license: other
	datasets:
	- georgesung/wizard_vicuna_70k_unfiltered
	---

	# Overview
	This model is [Llama-3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned with QLoRA on an uncensored/unfiltered Wizard-Vicuna conversation dataset (`georgesung/wizard_vicuna_70k_unfiltered`).

	This repository includes the fp32 Hugging Face version of the model, plus a 4-bit quantized (q4_0) [GGUF version](https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf?download=true).
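
To give a rough sense of the difference between the two versions, the sketch below estimates weight storage for each. It assumes a parameter count of about 8.03 billion (an approximation, not stated in this card) and uses the q4_0 block layout of 32 four-bit weights plus one fp16 scale. Real file sizes will differ somewhat because of metadata and tensors that may be stored at other precisions.

```python
# Back-of-envelope estimate of weight storage for the two released formats.
# Assumption: ~8.03e9 parameters; this figure is approximate and not from the model card.
N_PARAMS = 8.03e9


def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the storage needed for the weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


FP32_BITS = 32
# q4_0 packs 32 weights into one block: 32 four-bit values plus one 16-bit scale,
# which works out to (32 * 4 + 16) / 32 = 4.5 bits per weight.
Q4_0_BITS = (32 * 4 + 16) / 32

print(f"fp32: ~{weights_size_gb(N_PARAMS, FP32_BITS):.1f} GB")
print(f"q4_0: ~{weights_size_gb(N_PARAMS, Q4_0_BITS):.1f} GB")
```

Under these assumptions the quantized weights take roughly one seventh of the fp32 storage, which is what makes the GGUF file practical for local inference.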

	# Prompt style
	The model was trained with the following prompt style:
	```
	### HUMAN:
	Hello

	### RESPONSE:
	Hi, how are you?

	### HUMAN:
	I'm fine.

	### RESPONSE:
	How can I help you?
	...
	```
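
When calling the model outside of Ollama, the prompt has to be assembled by hand. The following is a minimal sketch of a helper that does so; `build_prompt` is a hypothetical name, and the exact whitespace (a newline after each header, a blank line between turns) is an assumption inferred from the example above rather than taken from the training code.

```python
from typing import Iterable, Tuple

HUMAN_TAG = "### HUMAN:"
RESPONSE_TAG = "### RESPONSE:"


def build_prompt(history: Iterable[Tuple[str, str]], user_message: str) -> str:
    """Format past (human, response) turns plus a new human message.

    Hypothetical helper: the spacing between turns is inferred from the
    example in this README, not confirmed against the training code.
    """
    blocks = []
    for human, response in history:
        blocks.append(f"{HUMAN_TAG}\n{human}")
        blocks.append(f"{RESPONSE_TAG}\n{response}")
    blocks.append(f"{HUMAN_TAG}\n{user_message}")
    # End with an open response header so the model writes the next reply.
    blocks.append(f"{RESPONSE_TAG}\n")
    return "\n\n".join(blocks)


prompt = build_prompt([("Hello", "Hi, how are you?")], "I'm fine.")
print(prompt)
```

The returned string ends with an open `### RESPONSE:` header, so the model's continuation is the next assistant turn.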

	# Training code
	The code used to train the model is available in the [llm_qlora repository](https://github.com/georgesung/llm_qlora).

	To reproduce the results:
	```
	git clone https://github.com/georgesung/llm_qlora
	cd llm_qlora
	pip install -r requirements.txt
	python train.py configs/llama3_8b_chat_uncensored.yaml
	```

	# Fine-tuning guide
	A guide to the fine-tuning process is available at https://georgesung.github.io/ai/qlora-ift/

	# Ollama inference
	First, install [Ollama](https://ollama.com/). Then, following Ollama's instructions for [importing a GGUF model](https://github.com/ollama/ollama/blob/main/README.md#import-from-gguf), download the GGUF file into a directory of your choice:
	```
	cd $MODEL_DIR_OF_CHOICE
	wget https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf
	```

	In the same directory, create a file called `llama3-uncensored.modelfile` with the following contents:
	```
	FROM ./llama3_8b_chat_uncensored_q4_0.gguf
	TEMPLATE """{{ .System }}

	### HUMAN:
	{{ .Prompt }}

	### RESPONSE:
	"""
	PARAMETER stop "### HUMAN:"
	PARAMETER stop "### RESPONSE:"
	```

	Then create the Ollama model and start an interactive chat session:
	```
	ollama create llama3-uncensored -f llama3-uncensored.modelfile
	ollama run llama3-uncensored
	```