second-state/Phi-3-mini-128k-instruct-GGUF

apepkuss79 committed on May 26

Commit 2302e2f • 1 Parent(s): bc12318

Update README.md

Files changed (1)

README.md +6 -8

README.md CHANGED Viewed

@@ -30,9 +30,7 @@ tags:
 ## Run with LlamaEdge
-<!-- - LlamaEdge version: [v0.8.4](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.8.4) and above -->
-- LlamaEdge version: coming soon
+- LlamaEdge version: [v0.11.2](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.11.2) and above
 - Prompt template
@@ -54,14 +52,14 @@ tags:
 - Context size: `128000`
-<!-- - Run as LlamaEdge service
+- Run as LlamaEdge service
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Phi-3-mini-128k-instruct-Q5_K_M.gguf \
     llama-api-server.wasm \
     --prompt-template phi-3-chat \
-    --ctx-size 3072 \
-    --model-name phi-3-mini
+    --ctx-size 128000 \
+    --model-name phi-3-mini-128k
   ```
 - Run as LlamaEdge command app
@@ -70,8 +68,8 @@ tags:
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Phi-3-mini-128k-instruct-Q5_K_M.gguf \
     llama-chat.wasm \
     --prompt-template phi-3-chat \
-    --ctx-size 3072 \
-  ``` -->
+    --ctx-size 128000
+  ```
 ## Quantized GGUF Models