---
language:
- en
license: cc-by-nc-4.0
model-index:
- name: Fimbulvetr-11B-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 70.14
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.79
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.83
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 63.43
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.95
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.67
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Sao10K/Fimbulvetr-11B-v2
      name: Open LLM Leaderboard
---

![Fox1](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2/resolve/main/cute1.jpg)

*Cute girl to catch your attention.*

**https://huggingface.co/Sao10K/Fimbulvetr-11B-v2-GGUF <------ GGUF**

Fimbulvetr-v2 - A Solar-Based Model

***

4/4 Status Update:

Got a few requests from people wanting to support me: https://ko-fi.com/sao10k

Anyway, status on v3: halted for the time being. I'm mainly working on the dataset, which is a pain, to be honest.
The data I have isn't up to my standard for now. It's good, just not good enough.

***

Prompt Formats - Alpaca or Vicuna. Either one works fine.
<br>Recommended SillyTavern Presets - Universal Light

Alpaca:
```
### Instruction:
<Prompt>
### Input:
<Insert Context Here>
### Response:
```

Vicuna:
```
System: <Prompt>

User: <Input>

Assistant:
```
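
For anyone assembling prompts in code rather than through a frontend, the two templates above can be filled in roughly like this. It's only a sketch: the function names `build_alpaca_prompt` and `build_vicuna_prompt` are made up for illustration, and the exact whitespace (single newlines for Alpaca, blank lines for Vicuna, trailing newline after `### Response:`) is assumed from how the templates are written here, not confirmed as what the model was trained on.

```python
def build_alpaca_prompt(instruction: str, context: str) -> str:
    """Fill the Alpaca template shown above (hypothetical helper)."""
    return (
        "### Instruction:\n"
        f"{instruction}\n"
        "### Input:\n"
        f"{context}\n"
        "### Response:\n"  # trailing newline is an assumption
    )


def build_vicuna_prompt(system: str, user: str) -> str:
    """Fill the Vicuna template shown above (hypothetical helper)."""
    # Blank lines between turns mirror the template as written in this card.
    return f"System: {system}\n\nUser: {user}\n\nAssistant:"


if __name__ == "__main__":
    print(build_alpaca_prompt("Continue the story.", "The fox waited in the snow."))
    print(build_vicuna_prompt("You are a storyteller.", "Continue the story."))
```

The resulting string is what gets passed to whatever backend runs the model; the model's reply is expected to follow the final `### Response:` or `Assistant:` marker.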


****

Changelogs:

25/2 - repo renamed to remove test, model card redone. Model's officially out.
<br>15/2 - Heavy testing complete. Good feedback.

***

<details><summary>Rant - Kept For Historical Reasons</summary>

Ramble to meet minimum length requirements:

Tbh I wonder if this shit is even worth doing. Like I'm just some broke guy lmao, I've spent so much. And for what? I guess creds. Feels good when a model gets good feedback, but it seems like I'm invisible sometimes. I should probably be advertising myself and my models in other places but I rarely have the time to. Probably just internal jealousy sparking up here and now. Whatever I guess.

Anyway, the EMT vocation I'm doing is cool, except it pays peanuts, damn bruh 1.1k per month lmao. Government too broke to pay for shit. Pays the bills I suppose.

Anyway cool beans, I'm either going to continue the Solar Train or go to Mixtral / Yi when I get paid.

You still here?
</details><br>

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Sao10K__Fimbulvetr-11B-v2)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |72.63|
|AI2 Reasoning Challenge (25-Shot)|70.14|
|HellaSwag (10-Shot)              |87.79|
|MMLU (5-Shot)                    |66.83|
|TruthfulQA (0-shot)              |63.43|
|Winogrande (5-shot)              |82.95|
|GSM8k (5-shot)                   |64.67|
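
The Avg. row looks like the plain unweighted mean of the six benchmark scores. That reading is an assumption on my part, but the numbers in the table bear it out to within rounding, as this quick check shows (the `scores` dict just copies the values from the table):

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 70.14,
    "HellaSwag (10-Shot)": 87.79,
    "MMLU (5-Shot)": 66.83,
    "TruthfulQA (0-shot)": 63.43,
    "Winogrande (5-shot)": 82.95,
    "GSM8k (5-shot)": 64.67,
}

# Unweighted mean across the six benchmarks (assumed to be how Avg. is derived).
average = sum(scores.values()) / len(scores)

if __name__ == "__main__":
    print(f"mean of {len(scores)} benchmarks: {average:.3f}")
```

The result lands within 0.01 of the reported 72.63.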