aashish1904 committed · Commit 6a1b664 · verified · 1 Parent(s): 23d94aa

Upload README.md with huggingface_hub

Files changed (1): README.md +76 -0
README.md ADDED
@@ -0,0 +1,76 @@
---
license: other
datasets:
- georgesung/wizard_vicuna_70k_unfiltered
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)


# QuantFactory/llama2_7b_chat_uncensored-GGUF
This is a quantized version of [georgesung/llama2_7b_chat_uncensored](https://huggingface.co/georgesung/llama2_7b_chat_uncensored), created using llama.cpp.

# Original Model Card


# Overview
Fine-tuned [Llama-2 7B](https://huggingface.co/TheBloke/Llama-2-7B-fp16) with an uncensored/unfiltered Wizard-Vicuna conversation dataset (originally from [ehartford/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered)).
Fine-tuning used QLoRA. Training ran for one epoch on a 24GB GPU (NVIDIA A10G) instance and took ~19 hours.

The version here is the fp16 HuggingFace model.

## GGML & GPTQ versions
Thanks to [TheBloke](https://huggingface.co/TheBloke), GGML and GPTQ versions are available:
* https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GGML
* https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GPTQ

## Running in Ollama
https://ollama.com/library/llama2-uncensored
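
If the model is served through a local Ollama install, a request to it might look like the sketch below. It assumes Ollama's HTTP API is listening on its default address `http://localhost:11434` and that the model was pulled under the tag `llama2-uncensored` (taken from the library URL above). The helper names `build_generate_request` and `generate` are hypothetical, not part of any library.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # assumption: Ollama's default local address
MODEL_TAG = "llama2-uncensored"         # assumption: tag taken from the library URL


def build_generate_request(prompt, model=MODEL_TAG, host=OLLAMA_HOST):
    """Build (but do not send) a non-streaming request to /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate(prompt):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model pulled, `generate("Hello")` should return the model's reply as a string.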

# Prompt style
The model was trained with the following prompt style:
```
### HUMAN:
Hello

### RESPONSE:
Hi, how are you?

### HUMAN:
I'm fine.

### RESPONSE:
How can I help you?
...
```
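
As an illustration, a multi-turn prompt in this style could be assembled with a small helper like the one below. This is a minimal sketch: the function name `build_prompt` is hypothetical, and the exact whitespace (one blank line between turns, a newline after the final `### RESPONSE:`) is inferred from the example above rather than confirmed from the training code.

```python
def build_prompt(history, user_message):
    """Format past (human, response) turns plus a new user message.

    The returned string ends with an open "### RESPONSE:" header so the
    model continues with its reply.
    """
    parts = []
    for human, response in history:
        # Each completed turn: human text, blank line, response text, blank line.
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n\n")
    # The new turn is left open for the model to complete.
    parts.append(f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n")
    return "".join(parts)


prompt = build_prompt([("Hello", "Hi, how are you?")], "I'm fine.")
```

The resulting `prompt` can be passed as-is to whichever runtime loads the model; generation is typically stopped when the model emits the next `### HUMAN:` header.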

# Training code
Code used to train the model is available [here](https://github.com/georgesung/llm_qlora).

To reproduce the results:
```
git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama2_7b_chat_uncensored.yaml
```

# Fine-tuning guide
https://georgesung.github.io/ai/qlora-ift/

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_georgesung__llama2_7b_chat_uncensored).

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 43.39                     |
| ARC (25-shot)         | 53.58                     |
| HellaSwag (10-shot)   | 78.66                     |
| MMLU (5-shot)         | 44.49                     |
| TruthfulQA (0-shot)   | 41.34                     |
| Winogrande (5-shot)   | 74.11                     |
| GSM8K (5-shot)        | 5.84                      |
| DROP (3-shot)         | 5.69                      |
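
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores. That reading is an assumption based on the numbers in the table, not on leaderboard documentation, and can be checked with a few lines:

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 53.58,
    "HellaSwag (10-shot)": 78.66,
    "MMLU (5-shot)": 44.49,
    "TruthfulQA (0-shot)": 41.34,
    "Winogrande (5-shot)": 74.11,
    "GSM8K (5-shot)": 5.84,
    "DROP (3-shot)": 5.69,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 43.39, matching the Avg. row
```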