---
license: other
datasets:
- georgesung/wizard_vicuna_70k_unfiltered
---
# Overview
This model is [Llama-3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) fine-tuned on an uncensored/unfiltered Wizard-Vicuna conversation dataset.
Fine-tuning was done with QLoRA.
This repository includes the fp32 Hugging Face version of the model, plus a 4-bit quantized (q4_0) [GGUF version](https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf?download=true).
# Prompt style
The model was trained with the following prompt style:
```
### HUMAN:
Hello
### RESPONSE:
Hi, how are you?
### HUMAN:
I'm fine.
### RESPONSE:
How can I help you?
...
```
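As a minimal sketch, the prompt style above can be assembled in Python like this. The function name `build_prompt` and the `(human, response)` tuple layout are hypothetical, chosen only for illustration, and the single-newline spacing between turns is an assumption based on the example as shown here.

```python
def build_prompt(turns, user_message):
    """Format a conversation in the ### HUMAN / ### RESPONSE prompt style.

    turns: list of (human, response) pairs from earlier in the chat.
    user_message: the new human message the model should answer.
    NOTE: the exact whitespace between turns is an assumption.
    """
    parts = []
    for human, response in turns:
        # One completed exchange: a human turn followed by the model's reply.
        parts.append(f"### HUMAN:\n{human}\n### RESPONSE:\n{response}\n")
    # End with an open RESPONSE header so the model writes the next reply.
    parts.append(f"### HUMAN:\n{user_message}\n### RESPONSE:\n")
    return "".join(parts)
```

Calling `build_prompt([("Hello", "Hi, how are you?")], "I'm fine.")` returns a string that ends with an open `### RESPONSE:` header, ready to be passed to the model for completion.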
# Training code
Code used to train the model is available [here](https://github.com/georgesung/llm_qlora).
To reproduce the results:
```
git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama3_8b_chat_uncensored.yaml
```
# Fine-tuning guide
The fine-tuning guide is available [here](https://georgesung.github.io/ai/qlora-ift/).
# Ollama inference
First, install [Ollama](https://ollama.com/). Based on instructions [here](https://github.com/ollama/ollama/blob/main/README.md#import-from-gguf), run the following:
```
cd $MODEL_DIR_OF_CHOICE
wget https://huggingface.co/georgesung/llama3_8b_chat_uncensored/resolve/main/llama3_8b_chat_uncensored_q4_0.gguf
```
Create a file called `llama3-uncensored.modelfile` with the following:
```
FROM ./llama3_8b_chat_uncensored_q4_0.gguf
TEMPLATE """{{ .System }}
### HUMAN:
{{ .Prompt }}
### RESPONSE:
"""
PARAMETER stop "### HUMAN:"
PARAMETER stop "### RESPONSE:"
```
Then run:
```
ollama create llama3-uncensored -f llama3-uncensored.modelfile
ollama run llama3-uncensored
```
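Once the model has been created, it can also be queried programmatically. The following is a rough sketch that assumes the Ollama server is running locally on its default port (11434) and that the model was created under the name `llama3-uncensored` as above. The helper names `build_payload` and `generate` are hypothetical. Since the modelfile's `TEMPLATE` already wraps the prompt in the `### HUMAN:` / `### RESPONSE:` markers, only the raw user message is sent.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="llama3-uncensored"):
    """Build the JSON body for a single, non-streaming generate request."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="llama3-uncensored", url=OLLAMA_URL):
    """Send the prompt to the local Ollama server and return the reply text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        # With "stream": False the server returns one JSON object
        # whose "response" field holds the generated text.
        return json.loads(reply.read().decode("utf-8"))["response"]
```

With the server running, `print(generate("Hello"))` should print the model's reply.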