bigmoyan committed on
Commit
060055d
1 Parent(s): b977cd1

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -26,6 +26,11 @@ Among its main features are:
26
  - device: Nvidia A100 40G
27
  - batch size: 8
28
 
29
  |version|speed|
30
  |:-:|:-:|
31
  |original|30 tokens/s|
 
26
  - device: Nvidia A100 40G
27
  - batch size: 8
28
 
29
+ **Since early ChatGLM versions did not support batch inference, the `original` entry in the table below was measured with batch_size=1.**
30
+
31
+
32
+ **According to [this discussion](https://huggingface.co/TMElyralab/lyraChatGLM/discussions/6), this bug has since been fixed, and the speed with batch_size=8 reaches up to 137 tokens/s.**
33
+
34
  |version|speed|
35
  |:-:|:-:|
36
  |original|30 tokens/s|