---
base_model:
- NTQAI/chatntq-ja-7b-v1.0
- Elizezen/Antler-7B
language:
- ja
tags:
- mistral
- mixtral
- merge
- moe
- not-for-all-audiences
- nsfw
pipeline_tag: text-generation
---

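# LightChatAssistant-2x7B-GGUF

This is a GGUF conversion of [Sdff-Ltba/LightChatAssistant-2x7B](https://huggingface.co/Sdff-Ltba/LightChatAssistant-2x7B).
Files with `_imatrix` in their names were quantized with the help of an importance matrix (iMatrix).
Note: the author uses the iQ3_XXS quantization when running this model.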

## Quantization procedure

The following commands were run (this is the case of quantizing to iQ3_XXS with an iMatrix).
```
python ./llama.cpp/convert.py ./LightChatAssistant-2x7B --outtype f16 --outfile ./gguf-model_f16.gguf
./llama.cpp/imatrix -m ./gguf-model_f16.gguf -f ./wiki.train.raw -o ./gguf-model_f16.imatrix --chunks 32
./llama.cpp/quantize --imatrix ./gguf-model_f16.imatrix ./gguf-model_f16.gguf ./LightChatAssistant-2x7B_iq3xxs.gguf iq3_xxs
```
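
The three steps above (convert to an f16 GGUF, compute the importance matrix, quantize) could be scripted so the same pipeline can be reused for other quantization types. The following is a minimal sketch, not the author's tooling: the helper name `build_quantize_commands` and its parameters are hypothetical, and it only reuses the paths and flags that appear in the commands above. It builds the argument lists without running them, so they can be inspected first.

```python
import shlex
from pathlib import Path


def build_quantize_commands(model_dir, quant_type="iq3_xxs",
                            calib_file="./wiki.train.raw", chunks=32,
                            llama_cpp_dir="./llama.cpp"):
    """Return the three command lines (as argv lists) for the steps above.

    Hypothetical helper: the name, parameters, and defaults are illustrative
    and simply mirror the paths and flags used in the README commands.
    """
    f16 = "./gguf-model_f16.gguf"
    imatrix = "./gguf-model_f16.imatrix"
    # Output name follows the README pattern, e.g. "..._iq3xxs.gguf".
    out = f"./{Path(model_dir).name}_{quant_type.replace('_', '')}.gguf"
    return [
        # 1. Convert the Hugging Face checkpoint to an f16 GGUF file.
        ["python", f"{llama_cpp_dir}/convert.py", model_dir,
         "--outtype", "f16", "--outfile", f16],
        # 2. Compute the importance matrix from the calibration text.
        [f"{llama_cpp_dir}/imatrix", "-m", f16, "-f", calib_file,
         "-o", imatrix, "--chunks", str(chunks)],
        # 3. Quantize the f16 file using the importance matrix.
        [f"{llama_cpp_dir}/quantize", "--imatrix", imatrix,
         f16, out, quant_type],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_quantize_commands("./LightChatAssistant-2x7B"):
        print(shlex.join(cmd))
```

To actually execute the steps, each list can be passed to `subprocess.run(cmd, check=True)` in order, so that a failed step stops the pipeline.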

## Environment

- CPU: Ryzen 5 5600X
- GPU: GeForce RTX 3060 12GB
- RAM: DDR4-3200 96GB
- OS: Windows 10
- software: Python 3.12.2, [KoboldCpp](https://github.com/LostRuins/koboldcpp) v1.61.2

#### KoboldCpp settings

(Only settings changed from the defaults are listed.)
- `GPU Layers: 33` (33 or higher offloads all layers to the GPU)
- `Context Size: 32768`
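
The same two settings can also be passed on the command line when KoboldCpp is launched from source rather than through the GUI. This is a minimal sketch, assuming the `--model`, `--gpulayers`, and `--contextsize` options of `koboldcpp.py`. The helper name `koboldcpp_args` and the script and model paths are placeholders. A GPU backend option (for example `--usecublas`) may also be needed; it is omitted here because the settings above do not say which backend was selected.

```python
import shlex


def koboldcpp_args(model_path, gpu_layers=33, context_size=32768,
                   script="koboldcpp.py"):
    """Build an argv list mirroring the two non-default settings above.

    Hypothetical helper: the script path and model path are placeholders.
    """
    return ["python", script, "--model", model_path,
            "--gpulayers", str(gpu_layers),
            "--contextsize", str(context_size)]


if __name__ == "__main__":
    # Print the launch command for inspection instead of running it.
    print(shlex.join(koboldcpp_args("./LightChatAssistant-2x7B_iq3xxs.gguf")))
```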