update llama-cli example
README.md
CHANGED
@@ -54,8 +54,26 @@ huggingface-cli download internlm/internlm2_5-7b-chat-gguf internlm2_5-7b-chat-f
 ## Inference
 
 You can use `llama-cli` for conducting inference. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md)
+
 ```shell
-build/bin/llama-cli
+build/bin/llama-cli \
+    --model internlm2_5-7b-chat-fp16.gguf \
+    --predict 512 \
+    --ctx-size 4096 \
+    --gpu-layers 32 \
+    --temp 0.8 \
+    --top-p 0.8 \
+    --top-k 50 \
+    --seed 1024 \
+    --color \
+    --prompt "<|im_start|>system\nYou are an AI assistant whose name is InternLM (书生·浦语).\n- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.\n- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.<|im_end|>\n" \
+    --interactive \
+    --multiline-input \
+    --conversation \
+    --verbose \
+    --logdir workdir/logdir \
+    --in-prefix "<|im_start|>user\n" \
+    --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
 ```
 
 ## Serving
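The `--in-prefix` and `--in-suffix` values in the added command wrap each user turn in the model's `<|im_start|>` / `<|im_end|>` markers. The sketch below only illustrates the text that results for one turn, assuming `llama-cli` expands the `\n` escapes in those arguments and concatenates prefix, user input, and suffix in that order. `build_turn` is a hypothetical helper written for this illustration; it is not part of `llama.cpp`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of llama.cpp): prints one user turn the way
# the --in-prefix / --in-suffix values above would frame it, assuming the
# "\n" escapes are expanded to real newlines.
build_turn() {
    # $1 is the user's message; %s inserts it verbatim, without escape processing.
    printf '<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' "$1"
}

# Example: show the framed text for a single message.
build_turn "Hello"
```

Printing the framed turn this way can help when checking that the prefix and suffix match the chat format used in the `--prompt` system message.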