leaderboard-pt-pr-bot committed on
Commit aa8a51e
1 Parent(s): a253df1

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions
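
For readers unfamiliar with the metadata this PR adds, the following is a minimal sketch of the shape of one `model-index` result entry, using the ENEM Challenge values from the diff. The helper name `build_result_entry` and the constant `LEADERBOARD_URL` are hypothetical and exist only for illustration; this is not the bot's actual code.

```python
from typing import Any

# URL taken from the diff; the constant name itself is an illustrative assumption.
LEADERBOARD_URL = (
    "https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard"
    "?query=nicolasdec/CabraQwen7b"
)


def build_result_entry(
    dataset_name: str,
    dataset_type: str,
    split: str,
    num_few_shot: int,
    metrics: list[tuple[str, float, str]],
) -> dict[str, Any]:
    """Hypothetical helper: assemble one entry of the `results` list."""
    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": {
            "name": dataset_name,
            "type": dataset_type,
            "split": split,
            "args": {"num_few_shot": num_few_shot},
        },
        # Each metric is given as (type, value, display name).
        "metrics": [
            {"type": m_type, "value": value, "name": name}
            for m_type, value, name in metrics
        ],
        "source": {
            "url": LEADERBOARD_URL,
            "name": "Open Portuguese LLM Leaderboard",
        },
    }


entry = build_result_entry(
    "ENEM Challenge (No Images)",
    "eduagarcia/enem_challenge",
    "train",
    3,
    [("acc", 69.21, "accuracy")],
)
print(entry["dataset"]["name"], entry["metrics"][0]["value"])
```

Serializing a list of such entries under `model-index` in the README front matter is what makes the scores visible on the model page.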

Files changed (1)
  1. README.md +147 -9
README.md CHANGED
@@ -1,18 +1,139 @@
  ---
  language:
- - pt
- - en
+ - pt
+ - en
  license: cc
  tags:
- - text-generation-inference
- - transformers
- - qwen
- - gguf
- - brazil
- - brasil
- - portuguese
+ - text-generation-inference
+ - transformers
+ - qwen
+ - gguf
+ - brazil
+ - brasil
+ - portuguese
  base_model: Qwen/Qwen1.5-7B-Chat
  pipeline_tag: text-generation
+ model-index:
+ - name: CabraQwen7b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 69.21
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 56.05
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 43.23
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 88.52
+       name: f1-macro
+     - type: pearson
+       value: 76.17
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 57.8
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 76.32
+       name: f1-macro
+     - type: f1_macro
+       value: 69.69
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 65.96
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=nicolasdec/CabraQwen7b
+       name: Open Portuguese LLM Leaderboard
  ---
  # Cabra Qwen 7b
  <img src="https://uploads-ssl.webflow.com/65f77c0240ae1c68f8192771/660b1a4df7de79066317cafe_cabra2.png" width="400" height="400">
@@ -166,3 +287,20 @@ O modelo é destinado, por agora, a fins de pesquisa. As áreas e tarefas de pes
  | | | exam_id__2015-16 | 3 | acc | 0.4125 | ± 0.0318 |
  | portuguese_hate_speech_binary | 1.0 | all | 25 | f1_macro | 0.6969 | ± 0.0119 |
  | | | all | 25 | acc | 0.7356 | ± 0.0107 |
+
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/nicolasdec/CabraQwen7b)
+
+ |          Metric          |  Value  |
+ |--------------------------|---------|
+ |Average                   |**66.99**|
+ |ENEM Challenge (No Images)|    69.21|
+ |BLUEX (No Images)         |    56.05|
+ |OAB Exams                 |    43.23|
+ |Assin2 RTE                |    88.52|
+ |Assin2 STS                |    76.17|
+ |FaQuAD NLI                |    57.80|
+ |HateBR Binary             |    76.32|
+ |PT Hate Speech Binary     |    69.69|
+ |tweetSentBR               |    65.96|
+
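
The Average row in the results table is consistent with a plain arithmetic mean of the nine task scores. The PR does not state how the leaderboard aggregates, so the unweighted mean used here is an assumption; the sketch only checks it against the numbers in the table.

```python
# Scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 69.21,
    "BLUEX (No Images)": 56.05,
    "OAB Exams": 43.23,
    "Assin2 RTE": 88.52,
    "Assin2 STS": 76.17,
    "FaQuAD NLI": 57.80,
    "HateBR Binary": 76.32,
    "PT Hate Speech Binary": 69.69,
    "tweetSentBR": 65.96,
}

# Assumption: the leaderboard average is an unweighted mean over all tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 66.99
```

The scores sum to 602.95, and dividing by 9 gives roughly 66.994, which rounds to the 66.99 reported in the table.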