---
language:
- en
license: cc-by-nc-4.0
model-index:
- name: Fimbulvetr-11B-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 70.14
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.79
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.83
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 63.43
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.95
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.67
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
---
![Fox1](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2/resolve/main/cute1.jpg)

*Cute girl to catch your attention.*

**https://huggingface.co/Sao10K/Fimbulvetr-11B-v2-GGUF <------ GGUF**

Fimbulvetr-v2 - A Solar-Based Model
***
4/4 Status Update:
Got a few requests from people wanting to support me: https://ko-fi.com/sao10k
Anyway, status on v3 - halted for the time being, mainly working on the dataset. It's a pain, to be honest.
The data I have isn't up to my standard for now. It's good, just not good enough.
***
Prompt Formats - Alpaca or Vicuna. Either one works fine.
Recommended SillyTavern Presets - Universal Light
Alpaca:
```
### Instruction:
<Prompt>
### Input:
<Insert Context Here>
### Response:
```
Vicuna:
```
System: <Prompt>
User: <Input>
Assistant:
```
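For anyone building these prompts in code rather than through a frontend, here is a minimal Python sketch of the two templates above. The function names are hypothetical, and two things are assumptions rather than anything this card specifies: sections are joined with a single newline, as the templates are shown here, and the `### Input:` section is left out when no context is given.

```python
def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Fill the Alpaca template shown above (hypothetical helper)."""
    parts = [f"### Instruction:\n{instruction}"]
    if context:
        # Assumption: drop the Input section entirely when there is no context.
        parts.append(f"### Input:\n{context}")
    # The prompt ends right after the Response header so the model continues from there.
    parts.append("### Response:\n")
    return "\n".join(parts)


def build_vicuna_prompt(system: str, user: str) -> str:
    """Fill the Vicuna-style template shown above (hypothetical helper)."""
    return f"System: {system}\nUser: {user}\nAssistant:"


if __name__ == "__main__":
    print(build_alpaca_prompt("Continue the story.", "It was a cold winter night."))
    print(build_vicuna_prompt("You are a storyteller.", "Continue the story."))
```

Either string can be passed as the raw prompt to whatever backend is serving the model. If a frontend such as SillyTavern is already applying a preset, this step is not needed.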
***
Changelogs:
25/2 - repo renamed to remove test, model card redone. Model's officially out.
<br>15/2 - Heavy testing complete. Good feedback.
***
<details><summary>Rant - Kept For Historical Reasons</summary>
Ramble to meet minimum length requirements:
Tbh I wonder if this shit is even worth doing. Like I'm just some broke guy lmao, I've spent so much. And for what? I guess creds. Feels good when a model gets good feedback, but it seems like I'm invisible sometimes. I should probably be advertising myself and my models in other places, but I rarely have the time to. Probably just internal jealousy sparking up here and now. Whatever I guess.
Anyway, the EMT vocation I'm doing is cool except it pays peanuts, damn bruh, 1.1k per month lmao. Government's too broke to pay for shit. Pays the bills I suppose.
Anyway cool beans, I'm either going to continue the Solar Train or go to Mixtral / Yi when I get paid.
You still here?
</details><br>
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Sao10K__Fimbulvetr-11B-v2)
| Metric |Value|
|---------------------------------|----:|
|Avg. |72.63|
|AI2 Reasoning Challenge (25-Shot)|70.14|
|HellaSwag (10-Shot) |87.79|
|MMLU (5-Shot) |66.83|
|TruthfulQA (0-shot) |63.43|
|Winogrande (5-shot) |82.95|
|GSM8k (5-shot) |64.67|
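
The `Avg.` row appears to be the plain arithmetic mean of the six benchmark scores. That is an assumption based on the numbers in the table, not something the card states. A quick Python check:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 70.14,
    "HellaSwag (10-Shot)": 87.79,
    "MMLU (5-Shot)": 66.83,
    "TruthfulQA (0-shot)": 63.43,
    "Winogrande (5-shot)": 82.95,
    "GSM8k (5-shot)": 64.67,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Mean of the six scores: {average:.3f}")
```

The mean lands within 0.01 of the reported 72.63, so the difference is down to how the last digit is rounded.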