mjdousti committed on
Commit da88011 · 1 Parent(s): beb0d76

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -29,7 +29,7 @@ co2_eq_emissions:
  # <span style="font-variant:small-caps;">PersianMind</span>
 
  <span style="font-variant:small-caps;">PersianMind</span> is a cross-lingual Persian-English large language model.
- The model achieves state-of-the-art results on Persian subset of the [Belebele](https://github.com/facebookresearch/belebele) benchmark
+ The model achieves state-of-the-art results on Persian subset of the [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) benchmark
  and the [ParsiNLU multiple-choice QA](https://github.com/persiannlp/parsinlu) task.
  It also attains performance comparable to GPT-3.5-turbo in a Persian reading comprehension task.
 
@@ -111,15 +111,15 @@ model = LlamaForCausalLM.from_pretrained(
 
  ### Evaluating Quantized Models
 
- | Model | Belebele (Persian) | Fa→En Translation | En→Fa Translation | Model Size | Tokens/sec |
- | :----------------------------------------------------------------- | :----------------: | :---------------: | :---------------: | :--------: | :--------: |
- | <span style="font-variant:small-caps;">PersianMind</span> (`bf16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
- | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
- | <span style="font-variant:small-caps;">PersianMind</span> (`INT4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |
+ | Model | <span style="font-variant:small-caps;">Belebele</span> (Persian) | Fa→En Translation | En→Fa Translation | Model Size | Tokens/sec |
+ | :----------------------------------------------------------------- | :--------------------------------------------------------------: | :---------------: | :---------------: | :--------: | :--------: |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`bf16`) | 73.9 | 83.61 | 79.44 | 13.7G | 25.35 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT8`) | 73.7 | 82.32 | 78.61 | 7.2G | 11.36 |
+ | <span style="font-variant:small-caps;">PersianMind</span> (`INT4`) | 70.2 | 82.07 | 80.36 | 3.9G | 24.36 |
 
  We evaluated quantized models in various tasks against the original model.
  Specifically, we evaluated all models using the reading comprehension multiple-choice
- question-answering benchmark of [Belebele](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
+ question-answering benchmark of [<span style="font-variant:small-caps;">Belebele</span>](https://github.com/facebookresearch/belebele) (Persian subset) and reported the accuracy of each model.
  Additionally, we evaluated our models for Persian-to-English and English-to-Persian translation tasks.
  For this, we utilized the Persian-English subset of the [Flores-200](https://github.com/facebookresearch/flores/tree/main/flores200) dataset and
  reported our results using the <span style="font-variant:small-caps;">Comet</span> metric.
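
As a rough aid for reading the quantization table in this hunk, the sketch below computes how each quantized variant compares with the `bf16` baseline. The numbers are copied from the table. The `RESULTS` layout and the `relative_to_baseline` helper are hypothetical names chosen for illustration, not part of the PersianMind code, and the size column is kept in the table's own "G" unit without assuming more about it.

```python
# Values copied from the "Evaluating Quantized Models" table.
# Keys are illustrative: Belebele accuracy, size as written ("G"), tokens/sec.
RESULTS = {
    "bf16": {"belebele": 73.9, "size_g": 13.7, "tok_s": 25.35},
    "INT8": {"belebele": 73.7, "size_g": 7.2, "tok_s": 11.36},
    "INT4": {"belebele": 70.2, "size_g": 3.9, "tok_s": 24.36},
}


def relative_to_baseline(results, baseline="bf16"):
    """Compare every non-baseline row against the baseline row.

    Returns, per variant, the accuracy difference in points and the
    size and throughput expressed as a fraction of the baseline.
    """
    base = results[baseline]
    summary = {}
    for name, row in results.items():
        if name == baseline:
            continue
        summary[name] = {
            "belebele_delta": round(row["belebele"] - base["belebele"], 2),
            "size_ratio": round(row["size_g"] / base["size_g"], 3),
            "speed_ratio": round(row["tok_s"] / base["tok_s"], 3),
        }
    return summary


if __name__ == "__main__":
    for name, stats in relative_to_baseline(RESULTS).items():
        print(name, stats)
```

Read this way, the table suggests a trade-off rather than a strict ordering: `INT8` keeps accuracy within 0.2 points of `bf16` but generates tokens at under half the speed, while `INT4` is the smallest and nearly as fast as `bf16` at the cost of a few accuracy points.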