Commit 2f5baf1
1 Parent(s): a253df1

Adding the Open Portuguese LLM Leaderboard Evaluation Results (#1)

- Adding the Open Portuguese LLM Leaderboard Evaluation Results (aa8a51e254387ed99b1e04f20d72d5da7e8d9900)


Co-authored-by: Open PT LLM Leaderboard PR Bot <leaderboard-pt-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +147 -9
README.md CHANGED
@@ -1,18 +1,139 @@
  ---
  language:
- - pt
- - en
  license: cc
  tags:
- - text-generation-inference
- - transformers
- - qwen
- - gguf
- - brazil
- - brasil
- - portuguese
  base_model: Qwen/Qwen1.5-7B-Chat
  pipeline_tag: text-generation
  ---
  # Cabra Qwen 7b
  <img src="https://uploads-ssl.webflow.com/65f77c0240ae1c68f8192771/660b1a4df7de79066317cafe_cabra2.png" width="400" height="400">
@@ -166,3 +287,20 @@ O modelo é destinado, por agora, a fins de pesquisa. As áreas e tarefas de pes
  | | | exam_id__2015-16 | 3 | acc | 0.4125 | ± 0.0318 |
  | portuguese_hate_speech_binary | 1.0 | all | 25 | f1_macro | 0.6969 | ± 0.0119 |
  | | | all | 25 | acc | 0.7356 | ± 0.0107 |
 
  ---
  language:
+ - pt
+ - en
  license: cc
  tags:
+ - text-generation-inference
+ - transformers
+ - qwen
+ - gguf
+ - brazil
+ - brasil
+ - portuguese
  base_model: Qwen/Qwen1.5-7B-Chat
  pipeline_tag: text-generation
+ model-index:
+ - name: CabraQwen7b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 69.21
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 56.05
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 43.23
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 88.52
+       name: f1-macro
+     - type: pearson
+       value: 76.17
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 57.8
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 76.32
+       name: f1-macro
+     - type: f1_macro
+       value: 69.69
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 65.96
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
  ---
  # Cabra Qwen 7b
  <img src="https://uploads-ssl.webflow.com/65f77c0240ae1c68f8192771/660b1a4df7de79066317cafe_cabra2.png" width="400" height="400">
 
  | | | exam_id__2015-16 | 3 | acc | 0.4125 | ± 0.0318 |
  | portuguese_hate_speech_binary | 1.0 | all | 25 | f1_macro | 0.6969 | ± 0.0119 |
  | | | all | 25 | acc | 0.7356 | ± 0.0107 |
+
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/nicolasdec/CabraQwen7b)
+
+ | Metric | Value |
+ |--------------------------|---------|
+ |Average |**66.99**|
+ |ENEM Challenge (No Images)| 69.21|
+ |BLUEX (No Images) | 56.05|
+ |OAB Exams | 43.23|
+ |Assin2 RTE | 88.52|
+ |Assin2 STS | 76.17|
+ |FaQuAD NLI | 57.80|
+ |HateBR Binary | 76.32|
+ |PT Hate Speech Binary | 69.69|
+ |tweetSentBR | 65.96|
+
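
Review note: the **Average** row in the added table can be sanity-checked against the nine task scores. The sketch below assumes the average is a plain unweighted mean of those scores, which the diff does not state. The `scores` dictionary and the `unweighted_mean` helper are illustrative names, not part of the leaderboard tooling.

```python
# Task scores copied from the results table added in this diff.
scores = {
    "ENEM Challenge (No Images)": 69.21,
    "BLUEX (No Images)": 56.05,
    "OAB Exams": 43.23,
    "Assin2 RTE": 88.52,
    "Assin2 STS": 76.17,
    "FaQuAD NLI": 57.80,
    "HateBR Binary": 76.32,
    "PT Hate Speech Binary": 69.69,
    "tweetSentBR": 65.96,
}


def unweighted_mean(values):
    """Arithmetic mean, assuming every task carries equal weight."""
    values = list(values)
    return sum(values) / len(values)


average = unweighted_mean(scores.values())
print(f"{average:.2f}")  # prints 66.99
```

Under that assumption the computed value agrees with the 66.99 reported in the table.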