TheBloke/WizardLM-Uncensored-Falcon-40B-GPTQ

@@ -1,7 +1,5 @@
 ---
 license: apache-2.0
-language:
-- en
 inference: false
 ---
@@ -21,7 +19,7 @@ inference: false
 # Eric Hartford's Samantha-Falcon-7B GPTQ
-This repo contains an experimental GPTQ 4bit model of [Eric Hartford's Samantha-Falcon-7B](https://huggingface.co/ehartford/samantha-falcon-7B).
 It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
@@ -75,7 +73,12 @@ So please first update text-genration-webui to the latest version.
 ## Prompt template
 ## About `trust-remote-code`
@@ -112,10 +115,10 @@ quantized_model_dir = "/path/to/TheBloke_WizardLM-Uncensored-Falcon-40B-GPTQ"
 from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=False)
-model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0", use_triton=False, use_safetensors=True, torch_dtype=torch.float32, trust_remote_code=True)
-prompt = "Write a story about llamas"
-prompt_template = f"You are Samantha, a sentient AI.\nUSER: {prompt}\nASSISTANT:"
 tokens = tokenizer(prompt_template, return_tensors="pt").to("cuda:0").input_ids
 output = model.generate(input_ids=tokens, max_new_tokens=100, do_sample=True, temperature=0.8)
@@ -136,7 +139,7 @@ It was created without group_size to reduce VRAM usage, and with `desc_act` (act
   * Works with text-generation-webui using `--autogptq --trust_remote_code`
     * At this time it does NOT work with one-click-installers
   * Does not work with any version of GPTQ-for-LLaMa
-  * Parameters: Groupsize = 64. No act-order.
 <!-- footer start -->
 ## Discord
@@ -165,3 +168,19 @@ Thank you to all my generous patrons and donaters.
 # Original model card

 ---
 license: apache-2.0
 inference: false
 ---
 # Eric Hartford's Samantha-Falcon-7B GPTQ
+This repo contains an experimental GPTQ 4bit model of [Eric Hartford's WizardLM Uncensored Falcon 40B](https://huggingface.co/ehartford/WizardLM-Uncensored-Falcon-40b).
 It is the result of quantising to 4bit using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
 ## Prompt template
+Prompt format is WizardLM.
+```
+What is a falcon?  Can I keep one as a pet?
+### Response:
+```
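A minimal sketch of how the prompt format shown above could be assembled, and how the reply could be separated from the echoed prompt after generation. The helper names `build_wizardlm_prompt` and `extract_response` are hypothetical and not part of any library. The sketch assumes the format is simply the instruction followed by a `### Response:` line.

```python
RESPONSE_MARKER = "### Response:"


def build_wizardlm_prompt(instruction: str) -> str:
    """Return the instruction followed by the response marker on its own line."""
    return f"{instruction.strip()}\n{RESPONSE_MARKER}"


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker, or the whole text if absent."""
    if RESPONSE_MARKER not in generated_text:
        return generated_text.strip()
    _, _, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip()


prompt_template = build_wizardlm_prompt("What is a falcon? Can I keep one as a pet?")
print(prompt_template)
```

The string returned by `build_wizardlm_prompt` can be passed to the tokenizer in place of a hand-written f-string, and `extract_response` can be applied to the decoded output to drop the prompt portion.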
 ## About `trust-remote-code`
 from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=False)
+model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0", use_triton=False, use_safetensors=True, torch_dtype=torch.bfloat16, trust_remote_code=True)
+prompt = "What is a falcon? Can I keep one as a pet?"
+prompt_template = f"{prompt}\n### Response:"
 tokens = tokenizer(prompt_template, return_tensors="pt").to("cuda:0").input_ids
 output = model.generate(input_ids=tokens, max_new_tokens=100, do_sample=True, temperature=0.8)
   * Works with text-generation-webui using `--autogptq --trust_remote_code`
     * At this time it does NOT work with one-click-installers
   * Does not work with any version of GPTQ-for-LLaMa
+  * Parameters: Groupsize = None. With act-order / desc_act.
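As an illustration of the parameters listed above, the sketch below writes them out as a plain dictionary. The field names (`bits`, `group_size`, `desc_act`) and the use of `-1` to mean "no group size" are assumptions modelled on AutoGPTQ's quantisation config, so check them against the AutoGPTQ version in use. The `describe` helper is hypothetical.

```python
import json

# Hypothetical mirror of the quantisation parameters listed above.
quantize_params = {
    "bits": 4,          # 4bit quantisation, as stated in the model card
    "group_size": -1,   # assumed convention for "Groupsize = None"
    "desc_act": True,   # act-order enabled
}


def describe(params: dict) -> str:
    """Render the parameters in the same wording the model card uses."""
    group = "None" if params["group_size"] == -1 else str(params["group_size"])
    order = "With act-order / desc_act" if params["desc_act"] else "No act-order"
    return f"Parameters: Groupsize = {group}. {order}."


print(json.dumps(quantize_params, indent=2))
print(describe(quantize_params))
```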
 <!-- footer start -->
 ## Discord
 # Original model card
+This is WizardLM trained on top of tiiuae/falcon-40b, with a subset of the dataset - responses that contained alignment / moralizing were removed. The intent is to train a WizardLM that doesn't have alignment built in, so that alignment (of any sort) can be added separately, for example with an RLHF LoRA.
+Shout out to the open source AI/ML community, and everyone who helped me out.
+Note:
+An uncensored model has no guardrails.
+You are responsible for anything you do with the model, just as you are responsible for anything you do with any dangerous object such as a knife, gun, lighter, or car. Publishing anything this model generates is the same as publishing it yourself. You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.
+Prompt format is WizardLM.
+```
+What is a falcon?  Can I keep one as a pet?
+### Response:
+```
+Thank you [chirper.ai](https://chirper.ai) for sponsoring some of my compute!