leaderboard-pr-bot committed
Commit 2f51a02
Parent: 619be17

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1):
  1. README.md (+138 −32)
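The diff below rewrites the card's YAML front matter, adding a `model-index` block with one entry per benchmark. As a rough, stand-alone sketch (not part of the PR) of how metric values could be pulled back out of such a block — a real consumer should use a YAML parser; this regex only handles the flat `- type: ... / value: ...` layout used here:

```python
import re

# Hypothetical example input mirroring one metrics entry from the model-index
# block added by this PR.
front_matter = """\
metrics:
- type: acc_norm
  value: 31.8
  name: normalized accuracy
"""

# Extract (metric type, value) pairs from adjacent "type:"/"value:" lines.
pairs = re.findall(r"- type: (.+)\n\s*value: ([\d.]+)", front_matter)
print(pairs)  # [('acc_norm', '31.8')]
```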
README.md CHANGED
@@ -1,43 +1,136 @@
 ---
+language:
+- en
 license: apache-2.0
-base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 library_name: peft
 tags:
 - llama-factory
 - lora
 - not-for-all-audiences
 - nsfw
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 datasets:
-- Nekochu/Luminia-mixture
-- UnfilteredAI/DAN
-# psy_mental_health.json Luminia-mixture dataset;
-- mpingale/mental-health-chat-dataset
-- Amod/mental_health_counseling_conversations
-- heliosbrahma/mental_health_chatbot_dataset
-- victunes/nart-100k-synthetic-buddy-mixed-names
-- Falah/Mental_health_dataset4Fine_Tuning
-- EmoCareAI/Psych8k
-- samhog/psychology-10k
-# Lumimaid-v0.2 (Lumimaid-v2.json) dataset:
-- Doctor-Shotgun/no-robots-sharegpt
-- Gryphe/Opus-WritingPrompts
-- NobodyExistsOnTheInternet/ToxicQAFinal
-- meseca/opus-instruct-9k
-- PJMixers/grimulkan_theory-of-mind-ShareGPT
-- CapybaraPure/Decontaminated-ShareGPT
-- MinervaAI/Aesir-Preview
-- Epiculous/Gnosis
-- Norquinal/claude_multiround_chat_30k
-- Locutusque/hercules-v5.0
-- G-reen/Duet-v0.5
-- cgato/SlimOrcaDedupCleaned
-- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
-- ChaoticNeutrals/Synthetic-Dark-RP
-- ChaoticNeutrals/Synthetic-RP
-- ChaoticNeutrals/Luminous_Opus
-- kalomaze/Opus_Instruct_25k
-language:
-- en
+- Nekochu/Luminia-mixture
+- UnfilteredAI/DAN
+- mpingale/mental-health-chat-dataset
+- Amod/mental_health_counseling_conversations
+- heliosbrahma/mental_health_chatbot_dataset
+- victunes/nart-100k-synthetic-buddy-mixed-names
+- Falah/Mental_health_dataset4Fine_Tuning
+- EmoCareAI/Psych8k
+- samhog/psychology-10k
+- Doctor-Shotgun/no-robots-sharegpt
+- Gryphe/Opus-WritingPrompts
+- NobodyExistsOnTheInternet/ToxicQAFinal
+- meseca/opus-instruct-9k
+- PJMixers/grimulkan_theory-of-mind-ShareGPT
+- CapybaraPure/Decontaminated-ShareGPT
+- MinervaAI/Aesir-Preview
+- Epiculous/Gnosis
+- Norquinal/claude_multiround_chat_30k
+- Locutusque/hercules-v5.0
+- G-reen/Duet-v0.5
+- cgato/SlimOrcaDedupCleaned
+- Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
+- ChaoticNeutrals/Synthetic-Dark-RP
+- ChaoticNeutrals/Synthetic-RP
+- ChaoticNeutrals/Luminous_Opus
+- kalomaze/Opus_Instruct_25k
+model-index:
+- name: Luminia-8B-RP
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 55.74
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 31.8
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 11.71
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 6.26
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 11.07
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 29.24
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nekochu/Luminia-8B-RP
+      name: Open LLM Leaderboard
 ---

 Fine-tuning of ‘Llama-3.1-8B’ with a focus on RP and uncensored.
@@ -232,4 +325,17 @@ Trying something new can be intimidating, but it can also be a great opportunity

 - To get an idea of the data portions: `Lumimaid-v2`: 50% (600MB); `psy_mental_health`: 30%; `faproulette_co-OCR-fix-gpt4o_qa_fixer`: 5%.

-</details>
+</details>
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Nekochu__Luminia-8B-RP)
+
+| Metric              | Value |
+|---------------------|------:|
+| Avg.                | 24.30 |
+| IFEval (0-Shot)     | 55.74 |
+| BBH (3-Shot)        | 31.80 |
+| MATH Lvl 5 (4-Shot) | 11.71 |
+| GPQA (0-shot)       |  6.26 |
+| MuSR (0-shot)       | 11.07 |
+| MMLU-PRO (5-shot)   | 29.24 |
+
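The `Avg.` row in the added table matches the plain arithmetic mean of the six benchmark scores listed beneath it. A quick stand-alone check (not part of the PR):

```python
# Scores as listed in the table added by this PR.
scores = {
    "IFEval (0-Shot)": 55.74,
    "BBH (3-Shot)": 31.80,
    "MATH Lvl 5 (4-Shot)": 11.71,
    "GPQA (0-shot)": 6.26,
    "MuSR (0-shot)": 11.07,
    "MMLU-PRO (5-shot)": 29.24,
}

# Arithmetic mean, rounded to two decimals as the leaderboard displays it.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 24.3 (shown as 24.30 in the table)
```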