second-state/CodeLlama-70b-Instruct-hf-GGUF


Xin Liu committed on Feb 1

Commit 8044ba2 • 1 Parent(s): b3be26a

Update

Signed-off-by: Xin Liu <sam@secondstate.io>

Files changed (1)

README.md +2 -0


@@ -47,6 +47,8 @@ tags:
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:CodeLlama-70b-Instruct-hf-Q2_K.gguf llama-api-server.wasm -p codellama-super-instruct -c 1024 --reverse-prompt 'Source: assistant\nEOT: true'
   ```
+  **Note that the model only works in the non-streaming mode.**
 ## Quantized GGUF Models
 | Name | Quant method | Bits | Size | Use case |