File size: 19,144 Bytes
INFO: 2024-11-26 20:17:25,790: llmtf.base.evaluator: Starting eval on ['darumeru/multiq', 'darumeru/parus', 'darumeru/rcb', 'darumeru/rwsd', 'darumeru/use']
INFO: 2024-11-26 20:17:25,791: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:25,791: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:27,525: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/rummlu']
INFO: 2024-11-26 20:17:27,525: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:27,525: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:29,517: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/enmmlu']
INFO: 2024-11-26 20:17:29,517: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:29,517: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:30,060: llmtf.base.darumeru/MultiQ: Loading Dataset: 4.27s
INFO: 2024-11-26 20:17:31,597: llmtf.base.evaluator: Starting eval on ['daru/treewayabstractive']
INFO: 2024-11-26 20:17:31,597: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:31,597: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:33,345: llmtf.base.evaluator: Starting eval on ['darumeru/cp_para_ru']
INFO: 2024-11-26 20:17:33,345: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:33,345: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:35,521: llmtf.base.evaluator: Starting eval on ['vikhrmodels/habr_qa_sbs', 'ruparam', 'shlepa/moviesmc', 'shlepa/musicmc', 'shlepa/lawmc', 'shlepa/booksmc']
INFO: 2024-11-26 20:17:35,521: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:35,521: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:36,308: llmtf.base.darumeru/cp_para_ru: Loading Dataset: 2.96s
INFO: 2024-11-26 20:17:36,742: llmtf.base.daru/treewayabstractive: Loading Dataset: 5.14s
INFO: 2024-11-26 20:17:37,659: llmtf.base.evaluator: Starting eval on ['ruopinionne']
INFO: 2024-11-26 20:17:37,659: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:37,660: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:37,984: llmtf.base.ruopinionne: Loading Dataset: 0.32s
INFO: 2024-11-26 20:17:39,551: llmtf.base.evaluator: Starting eval on ['nerel']
INFO: 2024-11-26 20:17:39,551: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:17:39,551: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:17:43,423: llmtf.base.NEREL: Loading Dataset: 3.87s
INFO: 2024-11-26 20:17:48,557: llmtf.base.vikhrmodels/habr_qa_sbs: Loading Dataset: 13.04s
INFO: 2024-11-26 20:18:44,135: llmtf.base.darumeru/cp_para_ru: Processing Dataset: 67.83s
INFO: 2024-11-26 20:18:44,145: llmtf.base.darumeru/cp_para_ru: Results for darumeru/cp_para_ru:
INFO: 2024-11-26 20:18:44,149: llmtf.base.darumeru/cp_para_ru: {'tokens_per_word': 1.905314928817744, 'symbol_per_token': 3.913951487866651, 'len': 0.9904780330832407, 'lcs': 0.8}
INFO: 2024-11-26 20:18:44,150: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:18:44,152: llmtf.base.evaluator: 
mean	darumeru/cp_para_ru
0.800	0.800
INFO: 2024-11-26 20:18:50,857: llmtf.base.NEREL: Processing Dataset: 67.43s
INFO: 2024-11-26 20:18:50,860: llmtf.base.NEREL: Results for NEREL:
INFO: 2024-11-26 20:18:50,864: llmtf.base.NEREL: {'tp': 2.0, 'fp': 27.0, 'fn': 519.0, 'micro-f1': 0.00727272727272595}
INFO: 2024-11-26 20:18:50,865: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:18:50,869: llmtf.base.evaluator: 
mean	NEREL	darumeru/cp_para_ru
0.404	0.007	0.800
INFO: 2024-11-26 20:18:57,627: llmtf.base.daru/treewayabstractive: Processing Dataset: 80.88s
INFO: 2024-11-26 20:18:57,629: llmtf.base.daru/treewayabstractive: Results for daru/treewayabstractive:
INFO: 2024-11-26 20:18:57,632: llmtf.base.daru/treewayabstractive: {'rouge1': 0.3138417117532064, 'rouge2': 0.10462617373556911}
INFO: 2024-11-26 20:18:57,634: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:18:57,637: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/cp_para_ru
0.339	0.007	0.209	0.800
INFO: 2024-11-26 20:19:13,379: llmtf.base.darumeru/MultiQ: Processing Dataset: 103.32s
INFO: 2024-11-26 20:19:13,384: llmtf.base.darumeru/MultiQ: Results for darumeru/MultiQ:
INFO: 2024-11-26 20:19:13,404: llmtf.base.darumeru/MultiQ: {'f1': 0.3016876852043635, 'em': 0.21319311663479923}
INFO: 2024-11-26 20:19:13,412: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:19:13,412: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:19:15,573: llmtf.base.darumeru/PARus: Loading Dataset: 2.16s
INFO: 2024-11-26 20:19:19,861: llmtf.base.darumeru/PARus: Processing Dataset: 4.29s
INFO: 2024-11-26 20:19:19,867: llmtf.base.darumeru/PARus: Results for darumeru/PARus:
INFO: 2024-11-26 20:19:19,887: llmtf.base.darumeru/PARus: {'acc': 0.44}
INFO: 2024-11-26 20:19:19,888: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:19:19,888: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:19:22,295: llmtf.base.darumeru/RCB: Loading Dataset: 2.40s
INFO: 2024-11-26 20:19:23,604: llmtf.base.ruopinionne: Processing Dataset: 105.62s
INFO: 2024-11-26 20:19:23,610: llmtf.base.ruopinionne: Results for ruopinionne:
INFO: 2024-11-26 20:19:23,639: llmtf.base.ruopinionne: {'f1': 0.02701209922104298}
INFO: 2024-11-26 20:19:23,640: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:19:23,656: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/MultiQ	darumeru/PARus	darumeru/cp_para_ru	ruopinionne
0.290	0.007	0.209	0.257	0.440	0.800	0.027
INFO: 2024-11-26 20:19:27,714: llmtf.base.darumeru/RCB: Processing Dataset: 5.42s
INFO: 2024-11-26 20:19:27,715: llmtf.base.darumeru/RCB: Results for darumeru/RCB:
INFO: 2024-11-26 20:19:27,722: llmtf.base.darumeru/RCB: {'acc': 0.4590909090909091, 'f1_macro': 0.36910715356478985}
INFO: 2024-11-26 20:19:27,724: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:19:27,724: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:19:29,690: llmtf.base.darumeru/RWSD: Loading Dataset: 1.96s
INFO: 2024-11-26 20:19:34,850: llmtf.base.darumeru/RWSD: Processing Dataset: 5.16s
INFO: 2024-11-26 20:19:34,855: llmtf.base.darumeru/RWSD: Results for darumeru/RWSD:
INFO: 2024-11-26 20:19:34,858: llmtf.base.darumeru/RWSD: {'acc': 0.49019607843137253}
INFO: 2024-11-26 20:19:34,859: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:19:34,859: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:19:38,002: llmtf.base.darumeru/USE: Loading Dataset: 3.14s
INFO: 2024-11-26 20:19:52,797: llmtf.base.nlpcoreteam/enMMLU: Loading Dataset: 143.28s
INFO: 2024-11-26 20:19:55,208: llmtf.base.nlpcoreteam/ruMMLU: Loading Dataset: 147.68s
INFO: 2024-11-26 20:20:33,495: llmtf.base.vikhrmodels/habr_qa_sbs: Processing Dataset: 164.94s
INFO: 2024-11-26 20:20:33,496: llmtf.base.vikhrmodels/habr_qa_sbs: Results for vikhrmodels/habr_qa_sbs:
INFO: 2024-11-26 20:20:33,533: llmtf.base.vikhrmodels/habr_qa_sbs: {'acc': 0.547, 'f1_macro': 0.5280856541414141}
INFO: 2024-11-26 20:20:33,546: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:20:33,547: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:20:40,434: llmtf.base.ruparam: Loading Dataset: 6.89s
INFO: 2024-11-26 20:21:07,899: llmtf.base.darumeru/USE: Processing Dataset: 89.90s
INFO: 2024-11-26 20:21:07,900: llmtf.base.darumeru/USE: Results for darumeru/USE:
INFO: 2024-11-26 20:21:07,942: llmtf.base.darumeru/USE: {'grade_norm': 0.06078431372549018}
INFO: 2024-11-26 20:21:07,948: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:21:08,016: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/MultiQ	darumeru/PARus	darumeru/RCB	darumeru/RWSD	darumeru/USE	darumeru/cp_para_ru	ruopinionne	vikhrmodels/habr_qa_sbs
0.324	0.007	0.209	0.257	0.440	0.414	0.490	0.061	0.800	0.027	0.538
INFO: 2024-11-26 20:25:43,772: llmtf.base.ruparam: Processing Dataset: 303.34s
INFO: 2024-11-26 20:25:43,788: llmtf.base.ruparam: Results for ruparam:
INFO: 2024-11-26 20:25:44,038: llmtf.base.ruparam: {'acc': 0.21363220494053065}
INFO: 2024-11-26 20:25:44,053: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:25:44,053: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:25:47,874: llmtf.base.shlepa/movie_mc: Loading Dataset: 3.82s
INFO: 2024-11-26 20:26:05,103: llmtf.base.shlepa/movie_mc: Processing Dataset: 17.22s
INFO: 2024-11-26 20:26:05,119: llmtf.base.shlepa/movie_mc: Results for shlepa/movie_mc:
INFO: 2024-11-26 20:26:05,122: llmtf.base.shlepa/movie_mc: {'acc': 0.22453703703703703}
INFO: 2024-11-26 20:26:05,130: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:26:05,130: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:26:08,736: llmtf.base.shlepa/music_mc: Loading Dataset: 3.60s
INFO: 2024-11-26 20:26:26,413: llmtf.base.shlepa/music_mc: Processing Dataset: 17.68s
INFO: 2024-11-26 20:26:26,416: llmtf.base.shlepa/music_mc: Results for shlepa/music_mc:
INFO: 2024-11-26 20:26:26,435: llmtf.base.shlepa/music_mc: {'acc': 0.24468085106382978}
INFO: 2024-11-26 20:26:26,438: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:26:26,439: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:26:31,058: llmtf.base.shlepa/law_mc: Loading Dataset: 4.62s
INFO: 2024-11-26 20:27:15,973: llmtf.base.shlepa/law_mc: Processing Dataset: 44.91s
INFO: 2024-11-26 20:27:15,980: llmtf.base.shlepa/law_mc: Results for shlepa/law_mc:
INFO: 2024-11-26 20:27:16,000: llmtf.base.shlepa/law_mc: {'acc': 0.537590113285273}
INFO: 2024-11-26 20:27:16,006: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [145111]
INFO: 2024-11-26 20:27:16,006: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['<|im_end|>']
INFO: 2024-11-26 20:27:19,668: llmtf.base.shlepa/books_mc: Loading Dataset: 3.66s
INFO: 2024-11-26 20:27:39,200: llmtf.base.shlepa/books_mc: Processing Dataset: 19.53s
INFO: 2024-11-26 20:27:39,204: llmtf.base.shlepa/books_mc: Results for shlepa/books_mc:
INFO: 2024-11-26 20:27:39,209: llmtf.base.shlepa/books_mc: {'acc': 0.3112033195020747}
INFO: 2024-11-26 20:27:39,212: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:27:39,231: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/MultiQ	darumeru/PARus	darumeru/RCB	darumeru/RWSD	darumeru/USE	darumeru/cp_para_ru	ruopinionne	ruparam	shlepa/books_mc	shlepa/law_mc	shlepa/movie_mc	shlepa/music_mc	vikhrmodels/habr_qa_sbs
0.318	0.007	0.209	0.257	0.440	0.414	0.490	0.061	0.800	0.027	0.214	0.311	0.538	0.225	0.245	0.538
INFO: 2024-11-26 20:28:00,389: llmtf.base.nlpcoreteam/enMMLU: Processing Dataset: 487.59s
INFO: 2024-11-26 20:28:00,396: llmtf.base.nlpcoreteam/enMMLU: Results for nlpcoreteam/enMMLU:
INFO: 2024-11-26 20:28:00,445: llmtf.base.nlpcoreteam/enMMLU:                                        metric
subject                                      
abstract_algebra                     0.310000
anatomy                              0.518519
astronomy                            0.717105
business_ethics                      0.630000
clinical_knowledge                   0.667925
college_biology                      0.680556
college_chemistry                    0.400000
college_computer_science             0.510000
college_mathematics                  0.270000
college_medicine                     0.618497
college_physics                      0.558824
computer_security                    0.750000
conceptual_physics                   0.561702
econometrics                         0.438596
electrical_engineering               0.579310
elementary_mathematics               0.455026
formal_logic                         0.412698
global_facts                         0.250000
high_school_biology                  0.738710
high_school_chemistry                0.541872
high_school_computer_science         0.650000
high_school_european_history         0.751515
high_school_geography                0.767677
high_school_government_and_politics  0.808290
high_school_macroeconomics           0.638462
high_school_mathematics              0.433333
high_school_microeconomics           0.676471
high_school_physics                  0.377483
high_school_psychology               0.814679
high_school_statistics               0.527778
high_school_us_history               0.715686
high_school_world_history            0.767932
human_aging                          0.641256
human_sexuality                      0.679389
international_law                    0.735537
jurisprudence                        0.777778
logical_fallacies                    0.760736
machine_learning                     0.455357
management                           0.757282
marketing                            0.837607
medical_genetics                     0.690000
miscellaneous                        0.713921
moral_disputes                       0.650289
moral_scenarios                      0.243575
nutrition                            0.663399
philosophy                           0.668810
prehistory                           0.682099
professional_accounting              0.489362
professional_law                     0.418514
professional_medicine                0.599265
professional_psychology              0.588235
public_relations                     0.600000
security_studies                     0.689796
sociology                            0.786070
us_foreign_policy                    0.780000
virology                             0.463855
world_religions                      0.789474
INFO: 2024-11-26 20:28:00,453: llmtf.base.nlpcoreteam/enMMLU:                                    metric
subject                                  
STEM                             0.528725
humanities                       0.644203
other (business, health, misc.)  0.610063
social sciences                  0.688972
INFO: 2024-11-26 20:28:00,464: llmtf.base.nlpcoreteam/enMMLU: {'acc': 0.6179910195383496}
INFO: 2024-11-26 20:28:00,501: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:28:00,524: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/MultiQ	darumeru/PARus	darumeru/RCB	darumeru/RWSD	darumeru/USE	darumeru/cp_para_ru	nlpcoreteam/enMMLU	ruopinionne	ruparam	shlepa/books_mc	shlepa/law_mc	shlepa/movie_mc	shlepa/music_mc	vikhrmodels/habr_qa_sbs
0.337	0.007	0.209	0.257	0.440	0.414	0.490	0.061	0.800	0.618	0.027	0.214	0.311	0.538	0.225	0.245	0.538
INFO: 2024-11-26 20:28:16,900: llmtf.base.nlpcoreteam/ruMMLU: Processing Dataset: 501.69s
INFO: 2024-11-26 20:28:16,902: llmtf.base.nlpcoreteam/ruMMLU: Results for nlpcoreteam/ruMMLU:
INFO: 2024-11-26 20:28:16,950: llmtf.base.nlpcoreteam/ruMMLU:                                        metric
subject                                      
abstract_algebra                     0.320000
anatomy                              0.414815
astronomy                            0.572368
business_ethics                      0.450000
clinical_knowledge                   0.505660
college_biology                      0.375000
college_chemistry                    0.310000
college_computer_science             0.380000
college_mathematics                  0.350000
college_medicine                     0.508671
college_physics                      0.431373
computer_security                    0.530000
conceptual_physics                   0.429787
econometrics                         0.298246
electrical_engineering               0.448276
elementary_mathematics               0.417989
formal_logic                         0.365079
global_facts                         0.240000
high_school_biology                  0.487097
high_school_chemistry                0.443350
high_school_computer_science         0.530000
high_school_european_history         0.654545
high_school_geography                0.525253
high_school_government_and_politics  0.481865
high_school_macroeconomics           0.430769
high_school_mathematics              0.392593
high_school_microeconomics           0.441176
high_school_physics                  0.291391
high_school_psychology               0.572477
high_school_statistics               0.416667
high_school_us_history               0.495098
high_school_world_history            0.632911
human_aging                          0.488789
human_sexuality                      0.496183
international_law                    0.677686
jurisprudence                        0.574074
logical_fallacies                    0.441718
machine_learning                     0.321429
management                           0.563107
marketing                            0.722222
medical_genetics                     0.470000
miscellaneous                        0.536398
moral_disputes                       0.517341
moral_scenarios                      0.237989
nutrition                            0.526144
philosophy                           0.546624
prehistory                           0.478395
professional_accounting              0.365248
professional_law                     0.331160
professional_medicine                0.367647
professional_psychology              0.415033
public_relations                     0.454545
security_studies                     0.595918
sociology                            0.631841
us_foreign_policy                    0.710000
virology                             0.409639
world_religions                      0.549708
INFO: 2024-11-26 20:28:16,959: llmtf.base.nlpcoreteam/ruMMLU:                                    metric
subject                                  
STEM                             0.413740
humanities                       0.500179
other (business, health, misc.)  0.469167
social sciences                  0.504442
INFO: 2024-11-26 20:28:16,967: llmtf.base.nlpcoreteam/ruMMLU: {'acc': 0.47188210698069283}
INFO: 2024-11-26 20:28:17,010: llmtf.base.evaluator: Ended eval
INFO: 2024-11-26 20:28:17,020: llmtf.base.evaluator: 
mean	NEREL	daru/treewayabstractive	darumeru/MultiQ	darumeru/PARus	darumeru/RCB	darumeru/RWSD	darumeru/USE	darumeru/cp_para_ru	nlpcoreteam/enMMLU	nlpcoreteam/ruMMLU	ruopinionne	ruparam	shlepa/books_mc	shlepa/law_mc	shlepa/movie_mc	shlepa/music_mc	vikhrmodels/habr_qa_sbs
0.345	0.007	0.209	0.257	0.440	0.414	0.490	0.061	0.800	0.618	0.472	0.027	0.214	0.311	0.538	0.225	0.245	0.538
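
The `mean` column in the final summary table above can be checked against the per-task scores. The sketch below is a minimal check, not part of llmtf: the helper name `parse_summary` and the variable names are hypothetical, and the values are copied by hand from the log lines above. It assumes the summary rows are tab-separated and that `mean` is an unweighted average over tasks. It also tests the assumption that tasks reporting two metrics (for example `rouge1` and `rouge2`) are summarized by the average of the two. That assumption is consistent with the numbers in this log but is not stated in it. Other tasks clearly use a single metric (`darumeru/cp_para_ru` matches `lcs`, `NEREL` matches `micro-f1`).

```python
from statistics import mean

# Final summary table, copied from the last two lines of the log (tab-separated).
HEADER = (
    "mean\tNEREL\tdaru/treewayabstractive\tdarumeru/MultiQ\tdarumeru/PARus\t"
    "darumeru/RCB\tdarumeru/RWSD\tdarumeru/USE\tdarumeru/cp_para_ru\t"
    "nlpcoreteam/enMMLU\tnlpcoreteam/ruMMLU\truopinionne\truparam\t"
    "shlepa/books_mc\tshlepa/law_mc\tshlepa/movie_mc\tshlepa/music_mc\t"
    "vikhrmodels/habr_qa_sbs"
)
VALUES = (
    "0.345\t0.007\t0.209\t0.257\t0.440\t0.414\t0.490\t0.061\t0.800\t"
    "0.618\t0.472\t0.027\t0.214\t0.311\t0.538\t0.225\t0.245\t0.538"
)


def parse_summary(header_line: str, value_line: str) -> dict[str, float]:
    """Pair each column name with its score (hypothetical helper)."""
    names = header_line.split("\t")
    scores = [float(v) for v in value_line.split("\t")]
    if len(names) != len(scores):
        raise ValueError("header and value rows have different lengths")
    return dict(zip(names, scores))


summary = parse_summary(HEADER, VALUES)
reported_mean = summary.pop("mean")

# Assumption: the reported mean is an unweighted average over the tasks.
recomputed_mean = round(mean(summary.values()), 3)

# Raw two-metric results, copied from the per-task result lines of the log.
two_metric_tasks = {
    "daru/treewayabstractive": (0.3138417117532064, 0.10462617373556911),
    "darumeru/MultiQ": (0.3016876852043635, 0.21319311663479923),
    "darumeru/RCB": (0.4590909090909091, 0.36910715356478985),
    "vikhrmodels/habr_qa_sbs": (0.547, 0.5280856541414141),
}

# Assumption: such tasks are summarized by the average of their two metrics.
pair_checks = {
    task: round(mean(metrics), 3) == summary[task]
    for task, metrics in two_metric_tasks.items()
}

print(f"tasks: {len(summary)}, reported mean: {reported_mean}, "
      f"recomputed mean: {recomputed_mean}")
for task, ok in pair_checks.items():
    print(f"{task}: two-metric average matches summary = {ok}")
```

Because the table values are already rounded to three decimals, the recomputed mean can only be expected to agree with the reported one to that precision.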