Initial merged FP16 model commit
README.md CHANGED
@@ -17,9 +17,9 @@ license: other
 </div>
 <!-- header end -->
 
-# LmSys' Vicuna 13B 1.3.0
+# LmSys' Vicuna 13B 1.3.0 fp16
 
-These files are GPTQ 4bit model files for [LmSys' Vicuna 13B 1.3.0
+These files are fp16 pytorch format model files for [LmSys' Vicuna 13B 1.3.0](https://huggingface.co/lmsys/vicuna-13b-v1.3) merged with [Kaio Ken's SuperHOT 8K](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test).
 
 [Kaio Ken's SuperHOT 30B LoRA](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test) is merged onto the base model, and 8K context can then be achieved during inference by using `trust_remote_code=True`.
 
@@ -27,7 +27,7 @@ Note that `config.json` has been set to a sequence length of 8192. This can be m
 
 ## Repositories available
 
-* [4-bit GPTQ models for GPU inference](https://huggingface.co/
+* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Vicuna-13B-1.3.0-SuperHOT-8K-GPTQ)
 * [Unquantised SuperHOT fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/%%REPO_FP16%%)
 * [Unquantised base fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/lmsys/vicuna-13b-v1.3)
 
@@ -91,46 +91,6 @@ I trained the LoRA with the following configuration:
 - AdamW beta1 of 0.9 and beta2 0.99, epsilon of 1e-5
 - Trained on 4-bit base model
 
-# Original model card: LmSys' Vicuna 13B 1.3.0
-
-
-# Vicuna Model Card
-
-## Model Details
-
-Vicuna is a chat assistant trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
-
-- **Developed by:** [LMSYS](https://lmsys.org/)
-- **Model type:** An auto-regressive language model based on the transformer architecture.
-- **License:** Non-commercial license
-- **Finetuned from model:** [LLaMA](https://arxiv.org/abs/2302.13971).
-
-### Model Sources
-
-- **Repository:** https://github.com/lm-sys/FastChat
-- **Blog:** https://lmsys.org/blog/2023-03-30-vicuna/
-- **Paper:** https://arxiv.org/abs/2306.05685
-- **Demo:** https://chat.lmsys.org/
-
-## Uses
-
-The primary use of Vicuna is research on large language models and chatbots.
-The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
-
-## How to Get Started with the Model
-
-Command line interface: https://github.com/lm-sys/FastChat#vicuna-weights.
-APIs (OpenAI API, Huggingface API): https://github.com/lm-sys/FastChat/tree/main#api.
-
-## Training Details
-
-Vicuna v1.3 is fine-tuned from LLaMA with supervised instruction fine-tuning.
-The training data is around 140K conversations collected from ShareGPT.com.
-See more details in the "Training Details of Vicuna Models" section in the appendix of this [paper](https://arxiv.org/pdf/2306.05685.pdf).
-
-## Evaluation
-
-Vicuna is evaluated with standard benchmarks, human preference, and LLM-as-a-judge. See more details in this [paper](https://arxiv.org/pdf/2306.05685.pdf).
-
-## Difference between different versions of Vicuna
-
-See [vicuna_weights_version.md](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md)
+# Original model card: LmSys' Vicuna 13B 1.3.0
+
+No original model card was provided.
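The updated README says 8K context is reached by passing `trust_remote_code=True` at load time. As a minimal sketch of that loading step, assuming the standard `transformers` AutoModel API and the fp16 repo id used in the links above (the exact repo id and generation settings are assumptions, not part of this commit):

```python
# Minimal sketch: load the merged fp16 model with trust_remote_code=True so the
# repo's custom modelling code (scaled RoPE) can extend the context to 8K.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "TheBloke/Vicuna-13B-1.3.0-SuperHOT-8K-fp16"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",      # keep the fp16 weights as shipped
    device_map="auto",       # requires the accelerate package; places weights on GPU
    trust_remote_code=True,  # needed for the 8K-context modelling code
)

prompt = "USER: Hello! ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Without `trust_remote_code=True` the custom RoPE scaling is presumably not applied, so generation beyond the base model's 2K context would be expected to degrade even though `config.json` declares a sequence length of 8192.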