Noromaid-13b-v0.2 / README.md
metadata
license: cc-by-nc-4.0
model-index:
  - name: Noromaid-13b-v0.2
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 60.92
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 84.04
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 57.67
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 52.58
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 74.11
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 21.76
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=NeverSleep/Noromaid-13b-v0.2
          name: Open LLM Leaderboard



Disclaimer:

This is a VERY EXPERIMENTAL version; don't expect everything to work!!!

If you don't like this model, use Noromaid 0.1.1.

You may use our custom prompting format (scroll down to download the config files!) or simple Alpaca. (Choose whichever fits best for you!)

Expect that many things will change in the next version!!


Mergemonster and a new dataset were used.

If you want a 7b or 20b version, hit us up in the Community tab!

This model is a collab between IkariDev and Undi!

Test model. Suitable for RP, ERP and general stuff.

[Recommended settings - No settings yet (please suggest some in the Community tab!)]

Description

This repo contains FP16 files of Noromaid-13b-v0.2.

FP16 - by IkariDev and Undi

GGUF - by IkariDev and Undi

Ratings:

Note: We have permission from all users to upload their ratings. We DON'T screenshot random reviews without asking whether we can put them here!

No ratings yet!

If you want your rating to be here, send us a message on Discord and we'll put up a screenshot of it here. Our Discord names are "ikaridev" and "undi".

Prompt template: Custom format, or Alpaca

Custom format:

UPDATED!! SillyTavern config files: Context, Instruct.

Alpaca:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
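
If you are building prompts in code rather than through SillyTavern, the Alpaca template above can be filled in with a few lines of Python. This is only a minimal sketch: the names `ALPACA_TEMPLATE` and `build_alpaca_prompt` are made up for illustration, and the single newline after `### Response:` is an assumption about where generation should start.

```python
# Alpaca template copied from the section above.
# Assumption: one trailing newline after "### Response:" so the model
# starts its answer on a fresh line.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(prompt: str) -> str:
    """Insert the user's instruction into the Alpaca template (hypothetical helper)."""
    # Strip stray whitespace so the blank lines around the section headers stay consistent.
    return ALPACA_TEMPLATE.format(prompt=prompt.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short greeting for a tavern keeper."))
```

The returned string is what you would pass to your inference backend as the raw prompt; the model's reply is whatever it generates after `### Response:`.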

Training data used:

  • no_robots dataset: gives the model more human-like behavior and enhances the output.
  • [Aesir Private RP dataset]: new data from a never-before-used dataset. It adds fresh data, no LimaRP spam; this is 100% new. Thanks to the MinervaAI Team and, in particular, Gryphe for letting us use it!
  • [Another private Aesir dataset]

Others

Undi: If you want to support me, you can here.

IkariDev: Visit my retro/neocities style website please kek

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 58.51 |
| AI2 Reasoning Challenge (25-Shot) | 60.92 |
| HellaSwag (10-Shot) | 84.04 |
| MMLU (5-Shot) | 57.67 |
| TruthfulQA (0-shot) | 52.58 |
| Winogrande (5-shot) | 74.11 |
| GSM8k (5-shot) | 21.76 |
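
The "Avg." row appears to be the plain arithmetic mean of the six benchmark scores. A quick sketch to check that, assuming equal weighting (the dictionary name `scores` is just for illustration):

```python
# Benchmark scores as listed in the results above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 60.92,
    "HellaSwag (10-Shot)": 84.04,
    "MMLU (5-Shot)": 57.67,
    "TruthfulQA (0-shot)": 52.58,
    "Winogrande (5-shot)": 74.11,
    "GSM8k (5-shot)": 21.76,
}

# Assumption: the leaderboard average is an unweighted mean of the six tasks.
average = sum(scores.values()) / len(scores)

print(f"Avg. {average:.2f}")  # prints "Avg. 58.51"
```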