This model is a re-merge using Yi-34B-Llama as the base. The original 200K-context base did not seem to perform well after merging several non-200K-context LoRAs, so this version is re-merged on the base that actually matches those LoRAs. The base has a 4096 context; according to the original Yi-34B notes, inference supports up to a 32K context (Alpha 8), and I recommend an 8K context (Alpha 2.5). For this merge I also changed the LoRA merge order, moving limarpv3 to the final merge step.
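As a minimal sketch of the recommended settings, assuming exllamav2's Python API (the model path is a local placeholder, and `scale_alpha_value` is exllamav2's NTK RoPE alpha option):

```python
# Minimal sketch: load this 4bpw quant at the recommended 8K context (Alpha 2.5).
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer

config = ExLlamaV2Config()
config.model_dir = "acsr-y34b-4bpw-hb6-exl2"  # local path to the quantized model
config.prepare()

config.max_seq_len = 8192       # recommended 8K context
config.scale_alpha_value = 2.5  # NTK RoPE alpha matching the 8K recommendation

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
```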
acsr-y34b-4bpw-hb6-exl2
- base model: Yi-34B-Llama
- LoRA: Yi-34b-alpaca-cot-lora (supports the Alpaca prompt format; a template sketch follows this list)
- LoRA: Yi-34B-Spicyboros-3.1-LoRA (unofficial conversation dataset)
- LoRA: limarpv3-yi-llama-34b-lora (long roleplay-style replies)
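The alpaca-cot LoRA expects the standard Alpaca prompt layout. The helper below builds that template; the function itself is a hypothetical convenience, not part of any library:

```python
# Hypothetical helper that builds the standard Alpaca prompt template.
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )
```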
Description
- This is a test quantization for ExLlamaV2.
- 4bpw
```sh
python convert.py -i acsr-y34b -c exl2/0000.parquet -o acsr-y34b-4bpw-hb6-exl2 -hb 6 -l 4096 -b 4.15
```
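Roughly, per the ExLlamaV2 convert script: `-i` is the source model directory, `-o` the output directory, `-c` the calibration dataset, `-b` the target average bits per weight, `-hb` the bit width of the output head, and `-l` the token length of the calibration rows.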
- see the convert documentation in the ExLlamaV2 repository for full flag definitions
- calibration dataset: WikiText-2-v1
- oobabooga/text-generation-webui: add `--trust-remote-code` to CMD_FLAGS.txt and load the model with the ExLlamav2 loader
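Outside the webui, generation with exllamav2 looks roughly like the sketch below, which continues from the loading and prompt-helper sketches above; the sampler values are illustrative:

```python
# Continues from the loading sketch; sampling values are illustrative.
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

prompt = build_alpaca_prompt("Summarize the strengths of a roleplay-tuned model in two sentences.")
output = generator.generate_simple(prompt, settings, 256)  # generate up to 256 new tokens
print(output)
```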