Commit 554cf8b by nm-research
1 Parent(s): ff98444

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ It achieves an average score of 58.34 on the [OpenLLM](https://huggingface.co/sp
 
 ### Model Optimizations
 
-This model was obtained by quantizing the weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) to INT8 data type.
+This model was obtained by quantizing the weights and activations of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) to INT8 data type.
 This optimization reduces the number of bits used to represent weights and activations from 16 to 8, reducing GPU memory requirements (by approximately 50%) and increasing matrix-multiply compute throughput (by approximately 2x).
 Weight quantization also reduces disk size requirements by approximately 50%.
 
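
Review note: the "approximately 50%" figure in the changed paragraph follows from storing each value in 8 bits instead of 16. The snippet below is a minimal sketch of that arithmetic, assuming symmetric per-tensor quantization with a single scale; the README does not state which scheme was actually used, and the function names and sample weights here are hypothetical.

```python
import struct


def quantize_int8(values):
    """Symmetric per-tensor quantization (assumed scheme, not confirmed by the README).

    Maps floats to integers in [-127, 127] using one shared scale.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from INT8 codes and the shared scale."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.75]
codes, scale = quantize_int8(weights)

# Serialize as 16-bit floats ('e') versus 8-bit signed integers ('b').
fp16_bytes = struct.pack(f"<{len(weights)}e", *weights)
int8_bytes = struct.pack(f"<{len(codes)}b", *codes)

restored = dequantize_int8(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(len(fp16_bytes), len(int8_bytes))
```

The INT8 buffer is half the size of the 16-bit one. The scale adds a small fixed overhead per tensor (or per channel or group, depending on the scheme), which is one reason the savings are stated as approximate.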