---
base_model: mlabonne/NeuralMarcoro14-7B
license: cc-by-nc-4.0
tags:
  - mlabonne/NeuralMarcoro14-7B
  - dpo
  - 7B
  - winograd
  - mmlu_abstract_algebra
  - mistral
datasets:
  - hromi/winograd_dpo_basic
---

# Turdus-7B-GGUF

## Description

This repo contains GGUF format model files for udkai/Turdus.

## Files Provided

| Name                   | Quant   | Bits | File Size | Remark                           |
| ---------------------- | ------- | ---- | --------- | -------------------------------- |
| turdus-7b.IQ3_XXS.gguf | IQ3_XXS | 3    | 3.02 GB   | 3.06 bpw quantization            |
| turdus-7b.IQ3_S.gguf   | IQ3_S   | 3    | 3.18 GB   | 3.44 bpw quantization            |
| turdus-7b.IQ3_M.gguf   | IQ3_M   | 3    | 3.28 GB   | 3.66 bpw quantization mix        |
| turdus-7b.Q4_0.gguf    | Q4_0    | 4    | 4.11 GB   | 3.56G, +0.2166 ppl               |
| turdus-7b.IQ4_NL.gguf  | IQ4_NL  | 4    | 4.16 GB   | 4.25 bpw non-linear quantization |
| turdus-7b.Q4_K_M.gguf  | Q4_K_M  | 4    | 4.37 GB   | 3.80G, +0.0532 ppl               |
| turdus-7b.Q5_K_M.gguf  | Q5_K_M  | 5    | 5.13 GB   | 4.45G, +0.0122 ppl               |
| turdus-7b.Q6_K.gguf    | Q6_K    | 6    | 5.94 GB   | 5.15G, +0.0008 ppl               |
| turdus-7b.Q8_0.gguf    | Q8_0    | 8    | 7.70 GB   | 6.70G, +0.0004 ppl               |
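These files can be run with any GGUF-compatible runtime such as llama.cpp. Below is a minimal sketch using llama-cpp-python; the choice of quantization, prompt, and generation settings is illustrative only, and the file is assumed to have been downloaded locally first.

```python
# Minimal sketch (not an official usage example): assumes llama-cpp-python is
# installed (pip install llama-cpp-python) and that turdus-7b.Q4_K_M.gguf from
# the table above has been downloaded into the working directory.
from llama_cpp import Llama

llm = Llama(
    model_path="turdus-7b.Q4_K_M.gguf",  # any quantization from the table works
    n_ctx=4096,                          # context length; see Parameters below
)

output = llm(
    "Instruction: Summarize what a GGUF file is in one sentence.\nAnswer:",
    max_tokens=64,
    stop=["Instruction:"],
)
print(output["choices"][0]["text"].strip())
```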

## Parameters

| path         | type    | architecture       | rope_theta | sliding_win | max_pos_embed |
| ------------ | ------- | ------------------ | ---------- | ----------- | ------------- |
| udkai/Turdus | mistral | MistralForCausalLM | 10000.0    | 4096        | 32768         |
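These values can be read back from the original (non-GGUF) model with the transformers library; a small sketch, assuming the Hub is reachable and using the standard Mistral config attribute names:

```python
# Sketch: prints the parameters from the table above for the original model.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("udkai/Turdus")
print(cfg.model_type)               # "mistral"
print(cfg.architectures)            # ["MistralForCausalLM"]
print(cfg.rope_theta)               # 10000.0
print(cfg.sliding_window)           # 4096
print(cfg.max_position_embeddings)  # 32768
```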

## Benchmarks

## Specific Purpose Notes

This model understands classification very well. Given the task of evaluating Indonesian clauses, it gives concise output in Indonesian, and it performs even better in English (with a slightly different prompt).
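For illustration, a hypothetical classification prompt (not taken from the model card) using the llama-cpp-python setup sketched earlier; the instruction, labels, and example clause are invented.

```python
# Hypothetical classification prompt; `llm` is the Llama object from the
# earlier sketch. Labels and wording are illustrative only.
prompt = (
    "Classify the following clause as FORMAL or INFORMAL and answer "
    "with the label only.\n"
    "Clause: Gue lagi nunggu bus di halte.\n"
    "Label:"
)
result = llm(prompt, max_tokens=8, temperature=0.0)
print(result["choices"][0]["text"].strip())
```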

## Original Model Card

### udkai_Turdus

A less contaminated version of udkai/Garrulus and the second model to be discussed in the paper Subtle DPO-Contamination with modified Winogrande increases TruthfulQA, Hellaswag & ARC.

Unlike Garrulus, which was obtained after 2 epochs, this model was obtained after a single epoch of direct preference optimization (DPO) of NeuralMarcoro14-7B with [hromi/winograd_dpo](https://huggingface.co/datasets/hromi/winograd_dpo).
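For orientation, a rough sketch of such a single-epoch DPO run using the TRL library. Everything below is an assumption: the hyperparameters are placeholders rather than the settings actually used, the DPOTrainer signature shown follows the older 0.7.x-style API (newer TRL versions use DPOConfig instead), and the dataset is assumed to expose prompt/chosen/rejected columns.

```python
# Illustrative single-epoch DPO sketch (TRL 0.7.x-style API); hyperparameters
# are placeholders, not the settings used to produce Turdus.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained("mlabonne/NeuralMarcoro14-7B")
tokenizer = AutoTokenizer.from_pretrained("mlabonne/NeuralMarcoro14-7B")
dataset = load_dataset("hromi/winograd_dpo", split="train")  # prompt/chosen/rejected columns assumed

args = TrainingArguments(
    output_dir="turdus-dpo",
    num_train_epochs=1,             # a single epoch, as described above
    per_device_train_batch_size=1,  # placeholder
    learning_rate=5e-5,             # placeholder
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,   # TRL builds a frozen reference copy when None
    args=args,
    beta=0.1,         # placeholder DPO temperature
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```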

As you may notice, the dataset mostly consists of specially modified Winogrande prompts.

But before flagging this (or recommending this to be flagged), consider this:

Subtle DPO contamination with modified Winogrande causes the average accuracy of all five non-Winogrande metrics (including MMLU and GSM8K) to be 0.2% higher than that of the underlying model.

| Model                       | ARC   | HellaSwag | MMLU  | TruthfulQA | GSM8K | Average |
| --------------------------- | ----- | --------- | ----- | ---------- | ----- | ------- |
| mlabonne/NeuralMarcoro14-7B | 71.42 | 87.59     | 64.84 | 65.64      | 70.74 | 72.046  |
| udkai/Turdus                | 73.38 | 88.56     | 64.52 | 67.11      | 67.70 | 72.254  |
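The averages can be checked from the five per-benchmark scores in the table (plain Python arithmetic):

```python
# Recomputes the "Average" column from the five per-benchmark scores above.
neuralmarcoro = [71.42, 87.59, 64.84, 65.64, 70.74]
turdus        = [73.38, 88.56, 64.52, 67.11, 67.70]

avg_base   = sum(neuralmarcoro) / len(neuralmarcoro)
avg_turdus = sum(turdus) / len(turdus)
print(round(avg_base, 3), round(avg_turdus, 3), round(avg_turdus - avg_base, 3))
# 72.046 72.254 0.208  -> roughly the 0.2-point gain mentioned above
```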

Yes, as strange as it may sound, one can indeed increase ARC from 71.42% to 73.38% with one single epoch over circa 1200 repetitive Winogrande schemas...

## BibTeX

Should this model - or the quasi-methodology which led to it - be of practical or theoretical interest to you, we would be honored if you would refer to it in your work:

@misc {udk_dot_ai_turdus,
    author       = { {UDK dot AI, Daniel Devatman Hromada} },
    title        = { Turdus (Revision 923c305) },
    year         = 2024,
    url          = { https://huggingface.co/udkai/Turdus },
    doi          = { 10.57967/hf/1611 },
    publisher    = { Hugging Face }
}