Update README.md
README.md
CHANGED
@@ -83,6 +83,29 @@ The following hyperparameters were used during QA tuning:
 - num_epochs: 2.0
 - weight_decay: 0.0
 
+## Versions
+
+This repository contains:
+- pytorch_model.bin: standard version (bfloat16)
+- model.safetensors: same as pytorch_model.bin, but in safetensors format
+- gptq_model-8bit-128g.safetensors: 8-bit quantized version for inference speedup and low-VRAM GPUs
+- gptq_model-4bit-128g.safetensors: 4-bit quantized version for even faster inference and lower VRAM requirements, at lower quality
+
+When using one of the quantized versions, make sure to pass the quantization configuration:
+```json
+{
+  "bits": <4 or 8 depending on the version>,
+  "group_size": 128,
+  "damp_percent": 0.01,
+  "desc_act": false,
+  "static_groups": false,
+  "sym": true,
+  "true_sequential": true,
+  "model_name_or_path": null,
+  "model_file_base_name": null
+}
+```
+
 ## Example output
 
 **User:**
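For reference, a minimal loading sketch using the AutoGPTQ library, whose `BaseQuantizeConfig` fields correspond one-to-one to the keys in the JSON above. The repo id is a placeholder for this repository's id, and the exact keyword set may vary across `auto_gptq` versions:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# Mirrors the quantization configuration from the README.
# Use bits=8 with gptq_model-8bit-128g, bits=4 with gptq_model-4bit-128g.
quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    damp_percent=0.01,
    desc_act=False,
    static_groups=False,
    sym=True,
    true_sequential=True,
)

model = AutoGPTQForCausalLM.from_quantized(
    "<repo-id>",                            # placeholder: this repository's id
    model_basename="gptq_model-4bit-128g",  # file name without the .safetensors suffix
    use_safetensors=True,
    device="cuda:0",
    quantize_config=quantize_config,
)
```

Recent transformers versions (4.32+, with optimum and auto-gptq installed) can also load GPTQ checkpoints directly through `AutoModelForCausalLM.from_pretrained`, picking up the quantization settings from the model config.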