Initial merged FP16 model commit
README.md CHANGED
@@ -17,9 +17,9 @@ license: other
 </div>
 <!-- header end -->
 
-# LmSys' Vicuna 13B 1.3.0
+# LmSys' Vicuna 13B 1.3.0 fp16
 
-These files are GPTQ 4bit model files for [LmSys' Vicuna 13B 1.3.0
+These files are fp16 pytorch format model files for [LmSys' Vicuna 13B 1.3.0](https://huggingface.co/lmsys/vicuna-13b-v1.3) merged with [Kaio Ken's SuperHOT 8K](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test).
 
 [Kaio Ken's SuperHOT 30B LoRA](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test) is merged onto the base model, and 8K context can then be achieved during inference by using `trust_remote_code=True`.
 
@@ -27,7 +27,7 @@ Note that `config.json` has been set to a sequence length of 8192. This can be m
 
 ## Repositories available
 
-* [4-bit GPTQ models for GPU inference](https://huggingface.co/
+* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Vicuna-13B-1.3.0-SuperHOT-8K-GPTQ)
 * [Unquantised SuperHOT fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/%%REPO_FP16%%)
 * [Unquantised base fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/lmsys/vicuna-13b-v1.3)
 
@@ -91,46 +91,6 @@ I trained the LoRA with the following configuration:
 - AdamW beta1 of 0.9 and beta2 0.99, epsilon of 1e-5
 - Trained on 4-bit base model
 
-# Original model card: LmSys' Vicuna 13B 1.3.0
-
-
-# Vicuna Model Card
-
-## Model Details
-
-Vicuna is a chat assistant trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
-
-- **Developed by:** [LMSYS](https://lmsys.org/)
-- **Model type:** An auto-regressive language model based on the transformer architecture.
-- **License:** Non-commercial license
-- **Finetuned from model:** [LLaMA](https://arxiv.org/abs/2302.13971).
-
-### Model Sources
-
-- **Repository:** https://github.com/lm-sys/FastChat
-- **Blog:** https://lmsys.org/blog/2023-03-30-vicuna/
-- **Paper:** https://arxiv.org/abs/2306.05685
-- **Demo:** https://chat.lmsys.org/
-
-## Uses
-
-The primary use of Vicuna is research on large language models and chatbots.
-The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
-
-## How to Get Started with the Model
-
-Command line interface: https://github.com/lm-sys/FastChat#vicuna-weights.
-APIs (OpenAI API, Huggingface API): https://github.com/lm-sys/FastChat/tree/main#api.
-
-## Training Details
-
-Vicuna v1.3 is fine-tuned from LLaMA with supervised instruction fine-tuning.
-The training data is around 140K conversations collected from ShareGPT.com.
-See more details in the "Training Details of Vicuna Models" section in the appendix of this [paper](https://arxiv.org/pdf/2306.05685.pdf).
-
-## Evaluation
-
-Vicuna is evaluated with standard benchmarks, human preference, and LLM-as-a-judge. See more details in this [paper](https://arxiv.org/pdf/2306.05685.pdf).
-
-## Difference between different versions of Vicuna
-
-See [vicuna_weights_version.md](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md)
+# Original model card: LmSys' Vicuna 13B 1.3.0
+
+No original model card was provided.
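The updated README says 8K context is reached by passing `trust_remote_code=True` at load time. As a minimal sketch of that loading step, assuming the standard `transformers` AutoModel API and the fp16 repo id used in the links above (the exact repo id and generation settings are assumptions, not part of this commit):

```python
# Minimal sketch: load the merged fp16 model with trust_remote_code=True so the
# repo's custom modelling code (scaled RoPE) can extend the context to 8K.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "TheBloke/Vicuna-13B-1.3.0-SuperHOT-8K-fp16"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",      # keep the fp16 weights as shipped
    device_map="auto",       # requires the accelerate package; places weights on GPU
    trust_remote_code=True,  # needed for the 8K-context modelling code
)

prompt = "USER: Hello! ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Without `trust_remote_code=True` the custom RoPE scaling is presumably not applied, so generation beyond the base model's 2K context would be expected to degrade even though `config.json` declares a sequence length of 8192.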