ibalampanis committed
Commit: 2c891bd
Parent(s): e32a928
Update README.md
README.md CHANGED

@@ -28,9 +28,9 @@ This repository contains GGUF format model files for [ilsp's Meltemi 7B Instruct
 
 | Name | Quantization Method | Precision (Bits) | File Size | Max RAM Required | Use Case |
 | --------------------------------------------------------------------------------------------------------------------------------------- | ------------------- | ---------------- | --------- | ---------------- | ------------------------------------------------------------- |
-| [meltemi-7b-instruct-v1_q8_0.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_q8_0.gguf) | Q8_0 | 8 | 7.…
-| [meltemi-7b-instruct-v1_f16.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f16.gguf) | F16 | 16 | …
-| [meltemi-7b-instruct-v1_f32.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f32.gguf) | F32 | 32 | 27.…
+| [meltemi-7b-instruct-v1_q8_0.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_q8_0.gguf) | Q8_0 | 8 | 7.95 GB | 7.30 GB | Low quality loss - recommended |
+| [meltemi-7b-instruct-v1_f16.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f16.gguf) | F16 | 16 | 15.00 GB | 14.20 GB | Very large, extremely low quality loss - recommended |
+| [meltemi-7b-instruct-v1_f32.gguf](https://huggingface.co/SPAHE/Meltemi-7B-Instruct-v1-GGUF/blob/main/meltemi-7b-instruct-v1_f32.gguf) | F32 | 32 | 27.90 GB | 29.30 GB | Very very large, extremely low quality loss - not recommended |
 
 **Note**: The above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
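To make the offloading note concrete: GGUF files like these are typically run with llama.cpp or its Python bindings, where `n_gpu_layers` controls how many transformer layers are moved to VRAM. Below is a minimal sketch, assuming the `huggingface_hub` and `llama-cpp-python` packages are installed and that 32 offloaded layers fit in your VRAM; the layer count, context size, and prompt are illustrative assumptions, not from the model card.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the Q8_0 file listed in the table above (a ~7.95 GB download).
model_path = hf_hub_download(
    repo_id="SPAHE/Meltemi-7B-Instruct-v1-GGUF",
    filename="meltemi-7b-instruct-v1_q8_0.gguf",
)

# n_gpu_layers=0 keeps all weights in system RAM (the ~7.30 GB figure);
# each offloaded layer moves its weights into VRAM instead.
llm = Llama(
    model_path=model_path,
    n_gpu_layers=32,  # illustrative; tune to your available VRAM
    n_ctx=2048,
)

out = llm("Question: What is Meltemi? Answer:", max_tokens=64)
print(out["choices"][0]["text"])
```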