webbigdata/ALMA-7B-Ja-V2

dahara1 committed on Nov 12, 2023

Commit c249758 (1 parent: 21ad81c)

Update README.md

Files changed (1)

README.md +12 -1


@@ -121,7 +121,18 @@ Using Colab, Google's free web tool, you can easily verify the performance of AL
 ## その他の版 Other Version
-### ALMA-7B-Ja-V2-GPTQ-Ja-En
 GPTQはモデルサイズを小さくする手法(量子化といいます)です。
 GPTQ is a technique (called quantization) that reduces model size.

 ## その他の版 Other Version
+### llama.cpp
+[llama.cpp](https://github.com/ggerganov/llama.cpp)の主な目的はMacBook上で4ビット整数量子化を使用して LLaMA モデルを実行する事です。
+The main purpose of [llama.cpp](https://github.com/ggerganov/llama.cpp) is to run the LLaMA model using 4-bit integer quantization on a MacBook.
+4ビット量子化に伴い、性能はやや低下しますが、mmngaさんが作成してくれた[webbigdata-ALMA-7B-Ja-V2-gguf](https://huggingface.co/mmnga/webbigdata-ALMA-7B-Ja-V2-gguf)を使うとMacやGPUを搭載していないWindows、Linuxで本モデルを動かす事ができます。
+Although 4-bit quantization somewhat reduces performance, [webbigdata-ALMA-7B-Ja-V2-gguf](https://huggingface.co/mmnga/webbigdata-ALMA-7B-Ja-V2-gguf), created by mmnga, lets you run this model on a Mac, or on Windows and Linux machines without a GPU.
+[GPU無版のColabで動かすサンプルはこちら](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_V2_gguf_Free_Colab_sample.ipynb)です。
+[Sample code for running it on Colab without a GPU is available here](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_V2_gguf_Free_Colab_sample.ipynb).
+### GPTQ
 GPTQはモデルサイズを小さくする手法(量子化といいます)です。
 GPTQ is a technique (called quantization) that reduces model size.
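
For readers who want to try the llama.cpp route added in this commit, here is a minimal sketch of CPU-only inference through the `llama-cpp-python` bindings. It is not taken from the README or the linked notebook. The assumptions are: `llama-cpp-python` is installed, a `.gguf` file from the mmnga repository has already been downloaded (its path is passed in, since no file name is given here), and the prompt follows the translation format documented on the upstream ALMA model card. The helper names `build_prompt` and `translate` are hypothetical.

```python
import sys


def build_prompt(text: str, src: str = "Japanese", tgt: str = "English") -> str:
    """Build a translation prompt.

    Assumption: this mirrors the prompt format shown on the upstream ALMA
    model card; adjust it if the gguf model expects something different.
    """
    return f"Translate this from {src} to {tgt}:\n{src}: {text}\n{tgt}:"


def translate(model_path: str, text: str,
              src: str = "Japanese", tgt: str = "English") -> str:
    """Run one translation on the CPU with a local gguf file."""
    # Imported lazily so build_prompt works without llama-cpp-python installed.
    from llama_cpp import Llama

    # n_gpu_layers=0 keeps every layer on the CPU (no GPU required).
    llm = Llama(model_path=model_path, n_ctx=2048, n_gpu_layers=0, verbose=False)

    # Greedy decoding; stop at the first newline so only one line is returned.
    result = llm(build_prompt(text, src, tgt),
                 max_tokens=256, temperature=0.0, stop=["\n"])
    return result["choices"][0]["text"].strip()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python translate.py /path/to/model.gguf "今日は良い天気ですね。"
    print(translate(sys.argv[1], sys.argv[2]))
```

Swapping the `src` and `tgt` arguments gives an English-to-Japanese prompt. The context size (`n_ctx=2048`) and the `max_tokens` limit are illustrative values, not settings recommended by the model authors.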