Thireus committed on
Commit
9e21be1
•
1 Parent(s): edb8490

Update README.md

Files changed (1)
  1. README.md +6 -4
README.md CHANGED
@@ -10,7 +10,7 @@ quantized_by: Thireus
10
 
11
  # WizardLM 70B V1.0 – EXL2
12
  - Model creator: [WizardLM](https://huggingface.co/WizardLM)
13
- - Original model: [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
14
  - FP16 Model used for quantization: [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) – float16 of [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
15
  - BF16 Model used for quantization: [WizardLM 70B V1.0-BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) – bfloat16 of [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
16
 
@@ -19,10 +19,12 @@ quantized_by: Thireus
19
  | Link | BITS (-b) | HEAD BITS (-hb) | MEASU-REMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | V. | Max Context Length | Base Model | Layers | VRAM Min | VRAM Max | PPL**
20
  | ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ | ---- | ---- |------------------ | ------------------ | ------------------ |
21
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [0.0.1](https://github.com/turboderp/exllamav2/tree/aee7a281708d5faff2ad0ea4b3a3a4b754f458f3) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 40GB | 44GB | 4.1640625 |
22
- | _soon_ | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ..GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/ec5164b8a8e282b91aedb2af94dfeb89887656b7) | 4096 | [BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) | 80 | ..GB | ..GB | .. |
 
23
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h8-exl2/) | 4.0 | 8 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/a4f2663e310919f007c593030d56ca110f99c261) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 39GB | 44GB | 4.24609375 |
24
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-5.0bpw-h6-exl2/) | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 44GB | [0.0.1](https://github.com/turboderp/exllamav2/tree/aee7a281708d5faff2ad0ea4b3a3a4b754f458f3) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 48GB | 52GB | 4.0625 |
25
- | _soon_ | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ..GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/ec5164b8a8e282b91aedb2af94dfeb89887656b7) | 4096 | [BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) | 80 | ..GB | ..GB | .. |
 
26
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-5.0bpw-h8-exl2/) | 5.0 | 8 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 44GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/a4f2663e310919f007c593030d56ca110f99c261) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 48GB | 52GB | 4.09765625 |
27
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-6.0bpw-h6-exl2/) | 6.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 49GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/fae6fb296c6db4e3b1314c49c030541bed98acb9) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 56GB | 60GB | 4.0703125 |
28
 
@@ -56,7 +58,7 @@ ASSISTANT:
56
 
57
  ## Quantization process:
58
 
59
- | Original Model | → | (optional but recommended) float16 or bfloat16 Model* | → | Safetensors Model** | → | EXL2 Model |
60
  | -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
61
  | [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | → | [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF)* | → | Safetensors** | → | EXL2 |
62
 
 
10
 
11
  # WizardLM 70B V1.0 – EXL2
12
  - Model creator: [WizardLM](https://huggingface.co/WizardLM)
13
+ - FP32 Original model used for quantization: [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) – float32
14
  - FP16 Model used for quantization: [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) – float16 of [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
15
  - BF16 Model used for quantization: [WizardLM 70B V1.0-BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) – bfloat16 of [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
16
 
 
19
  | Link | BITS (-b) | HEAD BITS (-hb) | MEASU-REMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | V. | Max Context Length | Base Model | Layers | VRAM Min | VRAM Max | PPL**
20
  | ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ | ---- | ---- |------------------ | ------------------ | ------------------ |
21
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [0.0.1](https://github.com/turboderp/exllamav2/tree/aee7a281708d5faff2ad0ea4b3a3a4b754f458f3) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 40GB | 44GB | 4.1640625 |
22
+ | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 33GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/ec5164b8a8e282b91aedb2af94dfeb89887656b7) | 4096 | [BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) | 80 | 39GB | 44GB | 4.2421875 |
23
+ | _soon_ | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ..GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/c0dd3412d59c0bc776264512bf76264e954c221d) | 4096 | [FP32](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | 80 | ..GB | ..GB | |
24
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h8-exl2/) | 4.0 | 8 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/a4f2663e310919f007c593030d56ca110f99c261) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 39GB | 44GB | 4.24609375 |
25
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-5.0bpw-h6-exl2/) | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 44GB | [0.0.1](https://github.com/turboderp/exllamav2/tree/aee7a281708d5faff2ad0ea4b3a3a4b754f458f3) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 48GB | 52GB | 4.0625 |
26
+ | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16-5.0bpw-h6-exl2/) | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 41GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/ec5164b8a8e282b91aedb2af94dfeb89887656b7) | 4096 | [BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) | 80 | 48GB | 52GB | 4.09765625 |
27
+ | _soon_ | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ..GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/c0dd3412d59c0bc776264512bf76264e954c221d) | 4096 | [FP32](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | 80 | ..GB | ..GB | |
28
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-5.0bpw-h8-exl2/) | 5.0 | 8 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 44GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/a4f2663e310919f007c593030d56ca110f99c261) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 48GB | 52GB | 4.09765625 |
29
  | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-6.0bpw-h6-exl2/) | 6.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 49GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/fae6fb296c6db4e3b1314c49c030541bed98acb9) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 56GB | 60GB | 4.0703125 |
30
 
 
58
 
59
  ## Quantization process:
60
 
61
+ | Original Model | → | (optional) float16 or bfloat16 Model* | → | Safetensors Model** | → | EXL2 Model |
62
  | -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
63
  | [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | → | [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF)* | → | Safetensors** | → | EXL2 |
64
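
The table columns above name the conversion options used for each quant (`-b`, `-hb`, `-ml`, `-l`, `-c`). As a rough illustration of how one table row might map to the final Safetensors-to-EXL2 step, here is a minimal sketch that assembles a command line for exllamav2's `convert.py`. The `-i` and `-o` flags, the directory paths, and the helper name `build_convert_command` are assumptions for illustration and are not taken from this README; check them against the exllamav2 commit linked in the `V.` column before running anything.

```python
import shlex


def build_convert_command(input_dir, output_dir, cal_dataset, bits, head_bits,
                          measurement_length=2048, length=2048):
    """Assemble an argument list for an EXL2 conversion run.

    The -b, -hb, -ml, -l and -c options mirror the table column headers.
    The -i and -o options are assumed here, not stated in the README.
    """
    return [
        "python", "convert.py",
        "-i", input_dir,                 # assumed: directory holding the Safetensors model
        "-o", output_dir,                # assumed: working/output directory
        "-c", cal_dataset,               # CAL DATASET (-c)
        "-b", str(bits),                 # BITS (-b)
        "-hb", str(head_bits),           # HEAD BITS (-hb)
        "-ml", str(measurement_length),  # MEASUREMENT LENGTH (-ml)
        "-l", str(length),               # LENGTH (-l)
    ]


# Values from the 4.0 bpw / 6 head-bit BF16 row; both paths are hypothetical.
cmd = build_convert_command(
    input_dir="/path/to/WizardLM-70B-V1.0-BF16",
    output_dir="/path/to/exl2-work",
    cal_dataset="0000.parquet",
    bits=4.0,
    head_bits=6,
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each option paired with its value and avoids quoting problems if a path contains spaces.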