---
license: other
datasets:
- georgesung/wizard_vicuna_70k_unfiltered
---

# Overview
This model is [Llama-3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned on an uncensored/unfiltered Wizard-Vicuna conversation dataset.
QLoRA was used for fine-tuning.

This repository includes the fp32 HuggingFace version of the model, plus a 4-bit quantized (q4_0) [GGUF version](https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf?download=true).

# Prompt style
The model was trained with the following prompt style:
```
### HUMAN:
Hello

### RESPONSE:
Hi, how are you?

### HUMAN:
I'm fine.

### RESPONSE:
How can I help you?
...
```
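As an illustration of the prompt style above, here is a minimal Python sketch that assembles a multi-turn prompt. The function name `build_prompt` and the `(human, response)` tuple layout are hypothetical, and the exact whitespace (a blank line between turns, a newline after the final `### RESPONSE:`) is an assumption read off the example above, not taken from the training code.

```python
from typing import List, Tuple


def build_prompt(history: List[Tuple[str, str]], user_message: str) -> str:
    """Format a conversation in the ### HUMAN / ### RESPONSE style.

    history: earlier (human, response) turns, oldest first.
    user_message: the new human message the model should answer.
    """
    parts = []
    for human, response in history:
        # Each completed turn is followed by a blank line (assumed spacing).
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n\n")
    # End with an open RESPONSE header so the model continues from there.
    parts.append(f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("Hello", "Hi, how are you?")], "I'm fine."))
```

The string returned by `build_prompt` ends with an open `### RESPONSE:` header, so the model's continuation is the next assistant turn.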

# Training code
The code used to train the model is available in the [llm_qlora repository](https://github.com/georgesung/llm_qlora).

To reproduce the results:
```
git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama3_8b_chat_uncensored.yaml
```

# Fine-tuning guide
A guide to the fine-tuning process is available at [georgesung.github.io/ai/qlora-ift](https://georgesung.github.io/ai/qlora-ift/).

# Ollama inference
First, install [Ollama](https://ollama.com/). Then, following Ollama's instructions for [importing from GGUF](https://github.com/ollama/ollama/blob/main/README.md#import-from-gguf), download the quantized model:
```
cd $MODEL_DIR_OF_CHOICE
wget https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf
```

In the same directory, create a file called `llama3-uncensored.modelfile` with the following contents:
```
FROM ./llama3_8b_chat_uncensored_q4_0.gguf
TEMPLATE """{{ .System }}

### HUMAN:
{{ .Prompt }}

### RESPONSE:
"""
PARAMETER stop "### HUMAN:"
PARAMETER stop "### RESPONSE:"
```

Then run:
```
ollama create llama3-uncensored -f llama3-uncensored.modelfile
ollama run llama3-uncensored
```
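
Once the model has been created, it can also be queried from a script rather than the interactive prompt. The following is a minimal sketch using only the Python standard library, assuming the Ollama server is running locally on its default port (11434) and that the model was created under the name `llama3-uncensored` as above. The helper names `build_request` and `generate` are hypothetical. Because the Modelfile template already wraps the input in the `### HUMAN:` / `### RESPONSE:` format, the prompt passed here is the plain user message.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3-uncensored") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3-uncensored") -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, the full completion is in the "response" field.
    return body["response"]
```

With the server running, `print(generate("Hello"))` should print the model's reply; a connection error usually means Ollama is not running or is listening on a different address.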