Update README.md
## Model info

FP8 (F8_E4M3) quantized version of Mistral-Nemo-Instruct-2407 with 512 epochs.

Tested on vLLM 0.5.3, but you need this patch to use it with vLLM 0.5.2: https://github.com/vllm-project/vllm/pull/6548
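The F8_E4M3 layout named above (1 sign bit, 4 exponent bits, 3 mantissa bits, in the no-infinity `E4M3FN` variant commonly used for FP8 weights) can be sketched with a small decoder. This helper is purely illustrative, not part of the model or of vLLM:

```python
def decode_e4m3fn(byte: int) -> float:
    """Decode an 8-bit FP8 E4M3FN value (1 sign, 4 exponent, 3 mantissa bits).

    E4M3FN reserves only exponent=15, mantissa=7 for NaN; there is no
    infinity encoding, which pushes the largest finite value up to 448.
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:
        return float("nan")
    if exp == 0:  # subnormal: no implicit leading 1, fixed exponent 1 - bias = -6
        return sign * (mant / 8) * 2.0 ** -6
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)  # exponent bias is 7

# Largest finite E4M3FN value: exponent=15, mantissa=6 -> 1.75 * 2^8 = 448
print(decode_e4m3fn(0b0_1111_110))  # -> 448.0
print(decode_e4m3fn(0b0_0111_000))  # -> 1.0 (exponent field 7 cancels bias 7)
```

With a bias of 7 the format spans roughly ±448, with 2^-9 as the smallest subnormal step, which is why FP8 weight quantization stores a per-tensor scale alongside the 8-bit values.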
```diff
--- vllm/model_executor/models/llama.py	2024-07-19 02:01:59.192831673 +0200