---
license: apache-2.0
datasets:
- Flmc/DISC-Med-SFT
language:
- zh
pipeline_tag: text-generation
tags:
- baichuan
- medical
- ggml
---

This repository contains the quantized DISC-MedLLM, which uses Baichuan-13B-Base as its base model.

The weights were converted to GGML format with [baichuan13b.cpp](https://github.com/ouwei2013/baichuan13b.cpp) (based on [llama.cpp](https://github.com/ggerganov/llama.cpp)).

| Model               | GGML quantize method | HDD size |
|---------------------|----------------------|----------|
| ggml-model-q4_0.bin | q4_0                 | 7.55 GB  |
| ggml-model-q4_1.bin | q4_1                 | 8.36 GB  |
| ggml-model-q5_0.bin | q5_0                 | 9.17 GB  |
| ggml-model-q5_1.bin | q5_1                 | 9.97 GB  |
<!-- |ggml-model-q8_0.bin | q8_0 | ?.?? GB | -->

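The table above trades disk size against quantization fidelity (more bits per weight means a larger file). As a rough illustration, a small helper can pick the largest file that fits a storage budget; the sizes are copied from the table, and the "larger file means higher fidelity" ordering is an assumption, not something this README states:

```python
from typing import Optional

# File sizes in GB, copied from the table above
# (q8_0 is commented out there, so it is omitted here).
MODEL_SIZES_GB = {
    "ggml-model-q4_0.bin": 7.55,
    "ggml-model-q4_1.bin": 8.36,
    "ggml-model-q5_0.bin": 9.17,
    "ggml-model-q5_1.bin": 9.97,
}

def largest_fitting(budget_gb: float) -> Optional[str]:
    """Return the largest (assumed highest-fidelity) file within budget."""
    fitting = [(size, name) for name, size in MODEL_SIZES_GB.items()
               if size <= budget_gb]
    return max(fitting)[1] if fitting else None
```

For example, with a 9 GB budget this selects `ggml-model-q4_1.bin`, and with less than 7.55 GB it returns `None`.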
## How to run inference
1. [Compile baichuan13b](https://github.com/ouwei2013/baichuan13b.cpp#build); this generates a main executable, `baichuan13b/build/bin/main`, and a server, `baichuan13b/build/bin/server`.
2. Download the weights from this repository into `baichuan13b/build/bin/`.

```python
llm_output = requests.post(
    …
    "n_predict": 512
}).json()
print(llm_output)
```
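The snippet above is missing its opening lines in this excerpt. A complete call might look like the following sketch; the host, port, and `/completion` endpoint are assumptions carried over from the llama.cpp server this project derives from, not values confirmed by this README:

```python
# Sketch of a full request to the running server binary.
# Assumptions (not from this README): the server listens on
# localhost:8080 and exposes a llama.cpp-style /completion endpoint.

def build_payload(prompt: str, n_predict: int = 512) -> dict:
    """JSON body for the request; n_predict caps the generated tokens."""
    return {"prompt": prompt, "n_predict": n_predict}

def query_server(prompt: str,
                 url: str = "http://localhost:8080/completion") -> dict:
    """POST the prompt and return the decoded JSON response."""
    # Imported here so the payload helper stays dependency-free.
    import requests
    resp = requests.post(url, json=build_payload(prompt))
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    llm_output = query_server("最近总是失眠，应该怎么办？")
    print(llm_output)
```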